[Embodied AI] Awesome-Embodied-AI-yunlongdong

张启昊 · Published 2024-06-14

Awesome-Embodied-AI

Scene Understanding

General Planner

| Name | Description | Paper |
| --- | --- | --- |
| MS | Segmentation | https://arxiv.org/abs/2306.17582 |

Image

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| SAM | Segmentation | https://arxiv.org/abs/2304.02643 | https://github.com/facebookresearch/segment-anything |
| YOLO-World | Open-Vocabulary Detection | https://arxiv.org/abs/2401.17270 | https://github.com/AILab-CVC/YOLO-World |
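
As a concrete example of how promptable segmentation models such as SAM are used for scene understanding, the sketch below runs the official `segment-anything` predictor on a single image with one foreground click. It assumes the package is installed and the public ViT-H checkpoint has been downloaded; the image path and click coordinates are placeholders.

```python
# Minimal promptable-segmentation sketch with the segment-anything package.
# Assumes the package is installed and the official ViT-H checkpoint is on disk;
# image path and click location are placeholders.
import numpy as np
import cv2
from segment_anything import SamPredictor, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = cv2.cvtColor(cv2.imread("scene.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# One foreground click (x, y) as the prompt; label 1 marks foreground.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]  # boolean HxW mask of the clicked object
```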

Point Cloud

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| SAM3D | Segmentation | https://arxiv.org/abs/2306.03908 | https://github.com/Pointcept/SegmentAnything3D |
| PointMixer | Understanding | https://arxiv.org/abs/2401.17270 | https://github.com/LifeBeyondExpectations/PointMixer |

Multi-Modal Grounding

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| GPT4V | MLM (Image + Language -> Language) | https://arxiv.org/abs/2303.08774 | |
| Claude3-Opus | MLM (Image + Language -> Language) | https://www.anthropic.com/news/claude-3-family | |
| GLaMM | Pixel Grounding | https://arxiv.org/abs/2311.03356 | https://github.com/mbzuai-oryx/groundingLMM |
| All-Seeing | Pixel Grounding | https://arxiv.org/abs/2402.19474 | https://github.com/OpenGVLab/all-seeing |
| LEO | 3D | https://arxiv.org/abs/2311.12871 | https://github.com/embodied-generalist/embodied-generalist |
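
To make the "Image + Language -> Language" rows concrete, here is a minimal sketch of querying a hosted multimodal LLM (GPT-4V style) about a robot workspace image through the `openai` Python client. The model name, prompt, and image path are illustrative assumptions, not taken from any paper above, and an `OPENAI_API_KEY` environment variable is assumed to be set.

```python
# Sketch of image+language -> language grounding with a hosted multimodal LLM.
# Assumes the openai Python client (v1+) and OPENAI_API_KEY in the environment;
# model name, prompt, and image path are illustrative placeholders.
import base64
from openai import OpenAI

client = OpenAI()

with open("workspace.jpg", "rb") as f:  # placeholder image of the robot's workspace
    b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable chat model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "List the graspable objects on the table, one per line."},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```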

Data Collection

From Video

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| Vid2Robot | | https://vid2robot.github.io/vid2robot.pdf | |
| RT-Trajectory | | https://arxiv.org/abs/2311.01977 | |
| MimicPlay | | https://mimic-play.github.io/assets/MimicPlay.pdf | https://github.com/j96w/MimicPlay |

Hardware

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| UMI | Two-Fingers | https://arxiv.org/abs/2402.10329 | https://github.com/real-stanford/universal_manipulation_interface |
| DexCap | Five-Fingers | https://dex-cap.github.io/assets/DexCap_paper.pdf | https://github.com/j96w/DexCap |
| HIRO Hand | Hand-over-hand | https://sites.google.com/view/hiro-hand | |

Generative Simulation

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| MimicGen | | https://arxiv.org/abs/2310.17596 | https://github.com/NVlabs/mimicgen_environments |
| RoboGen | | https://arxiv.org/abs/2311.01455 | https://github.com/Genesis-Embodied-AI/RoboGen |

Action Output

Generative Imitation Learning

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| Diffusion Policy | | https://arxiv.org/abs/2303.04137 | https://github.com/real-stanford/diffusion_policy |
| ACT | | https://arxiv.org/abs/2304.13705 | https://github.com/tonyzhaozh/act |
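
Both entries above treat imitation learning as conditional generation of action sequences. The toy sketch below illustrates the core idea behind diffusion-style policies: a small network is trained to denoise a noised action chunk conditioned on the observation. It is a generic PyTorch illustration with made-up dimensions, not the official Diffusion Policy or ACT code.

```python
# Toy sketch of diffusion-style action generation for imitation learning:
# an MLP learns to predict the noise added to an action chunk, conditioned on
# the observation. Dimensions and data are made up for illustration.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, HORIZON, T = 10, 2, 8, 100  # toy sizes; T diffusion steps

class NoisePredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM * HORIZON + 1, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ACT_DIM * HORIZON),
        )

    def forward(self, obs, noisy_actions, t):
        x = torch.cat([obs, noisy_actions.flatten(1), t.float().unsqueeze(1) / T], dim=1)
        return self.net(x).view(-1, HORIZON, ACT_DIM)

betas = torch.linspace(1e-4, 0.02, T)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)

model = NoisePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(obs, actions):
    """One DDPM-style training step on a batch of (observation, action-chunk) demos."""
    t = torch.randint(0, T, (obs.shape[0],))
    noise = torch.randn_like(actions)
    a_bar = alphas_cumprod[t].view(-1, 1, 1)
    noisy = a_bar.sqrt() * actions + (1 - a_bar).sqrt() * noise  # forward diffusion
    pred = model(obs, noisy, t)
    loss = nn.functional.mse_loss(pred, noise)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Example usage with random stand-in data:
loss = train_step(torch.randn(32, OBS_DIM), torch.randn(32, HORIZON, ACT_DIM))
```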

Affordance Map

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| CLIPort | Pick & Place | https://arxiv.org/pdf/2109.12098.pdf | https://github.com/cliport/cliport |
| AffordPose | | https://arxiv.org/abs/2309.08942 | https://github.com/GentlesJan/AffordPose |
| Robo-Affordances | Contact & post-contact trajectories | https://arxiv.org/abs/2304.08488 | https://github.com/shikharbahl/vrb |
| Robo-ABC | | https://arxiv.org/abs/2401.07487 | https://github.com/TEA-Lab/Robo-ABC |
| Where2Explore | Few-shot learning from semantic similarity | https://proceedings.neurips.cc/paper_files/paper/2023/file/0e7e2af2e5ba822c9ad35a37b31b5dd4-Paper-Conference.pdf | |
| Move as You Say, Interact as You Can | Affordance-to-motion from a diffusion model | https://arxiv.org/pdf/2403.18036.pdf | |
| AffordanceLLM | Grounding affordance with an LLM | https://arxiv.org/pdf/2401.06341.pdf | |
| Environment-aware Affordance | | https://proceedings.neurips.cc/paper_files/paper/2023/file/bf78fc727cf882df66e6dbc826161e86-Paper-Conference.pdf | |
| OpenAD | Open-vocabulary affordance detection from point clouds | https://www.csc.liv.ac.uk/~anguyen/assets/pdfs/2023_OpenAD.pdf | https://github.com/Fsoft-AIC/Open-Vocabulary-Affordance-Detection-in-3D-Point-Clouds |
| RLAfford | End-to-end affordance learning with RL | https://gengyiran.github.io/pdf/RLAfford.pdf | |
| General Flow | Collect affordance from video | https://general-flow.github.io/general_flow.pdf | https://github.com/michaelyuancb/general_flow |
| PreAffordance | Pre-grasping planning | https://arxiv.org/pdf/2404.03634.pdf | |
| SceneFun3D | Fine-grained functionality & affordance in 3D scenes | https://aycatakmaz.github.io/data/SceneFun3D-preprint.pdf | https://github.com/SceneFun3D/scenefun3d |
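
Most of the works above predict a dense affordance map over an image or point cloud. The sketch below shows a common downstream step: converting such a heatmap into a 3D pick target by taking the highest-scoring pixel and back-projecting it with depth and camera intrinsics, in the spirit of CLIPort-style pick & place. It is a generic illustration with random stand-in data, not code from any of the listed papers.

```python
# Generic illustration: turn a per-pixel affordance heatmap into a 3D pick target.
# The heatmap would come from one of the models above; here it is random stand-in data.
import numpy as np

def pick_from_affordance(heatmap: np.ndarray, depth: np.ndarray, intrinsics: np.ndarray):
    """heatmap: HxW scores in [0, 1]; depth: HxW meters; intrinsics: 3x3 camera matrix."""
    v, u = np.unravel_index(np.argmax(heatmap), heatmap.shape)  # best pixel (row, col)
    z = depth[v, u]
    fx, fy, cx, cy = intrinsics[0, 0], intrinsics[1, 1], intrinsics[0, 2], intrinsics[1, 2]
    # Back-project the pixel to a 3D point in the camera frame.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z]), heatmap[v, u]

# Example with random stand-in data:
H, W = 240, 320
point, score = pick_from_affordance(
    np.random.rand(H, W),
    np.full((H, W), 0.6),
    np.array([[600, 0, W / 2], [0, 600, H / 2], [0, 0, 1]]),
)
```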

Question&Answer from LLM

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| COPA | | https://arxiv.org/abs/2403.08248 | |
| ManipLLM | | https://arxiv.org/abs/2312.16217 | |
| ManipVQA | | https://arxiv.org/pdf/2403.11289.pdf | https://github.com/SiyuanHuang95/ManipVQA |

Language Corrections

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| OLAF | | https://arxiv.org/pdf/2310.17555 | |
| YAYRobot | | https://arxiv.org/abs/2403.12910 | https://github.com/yay-robot/yay_robot |

Planning from LLM

| Name | Description | Paper | Code |
| --- | --- | --- | --- |
| SayCan | API level | https://arxiv.org/abs/2204.01691 | https://github.com/google-research/google-research/tree/master/saycan |
| VILA | Prompt level | https://arxiv.org/abs/2311.17842 | |
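
As a sketch of what "API level" planning means here: SayCan-style planners score each low-level skill by combining the LLM's preference for the skill with a learned affordance (success) estimate, then execute the best skill and repeat. The helper functions below (`llm_skill_score`, `affordance_score`) are hypothetical stand-ins, not the released SayCan code.

```python
# SayCan-style skill selection sketch: pick the skill maximizing
# p_LLM(skill | instruction, history) * p_success(skill).
# `llm_skill_score` and `affordance_score` are hypothetical stand-ins.
from typing import Callable, List

def saycan_step(instruction: str,
                history: List[str],
                skills: List[str],
                llm_skill_score: Callable[[str, List[str], str], float],
                affordance_score: Callable[[str], float]) -> str:
    """Select the next skill to execute given the instruction and skills done so far."""
    best_skill, best_score = None, float("-inf")
    for skill in skills:
        score = llm_skill_score(instruction, history, skill) * affordance_score(skill)
        if score > best_score:
            best_skill, best_score = skill, score
    return best_skill

# Toy usage with dummy scorers:
skills = ["pick up the sponge", "go to the table", "wipe the table", "done"]
next_skill = saycan_step(
    "clean the spill on the table", [],
    skills,
    llm_skill_score=lambda ins, hist, s: {"pick up the sponge": 0.6}.get(s, 0.1),
    affordance_score=lambda s: 0.9,
)
```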
