
Search-agent RAG underdelivers in practice: UIUC open-sources s3, which needs only 2.4k samples, trains fast, and performs well


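Agentic RAG (Retrieval-Augmented Generation) is becoming a key way for large language models to access external knowledge. In practice, however, reinforcement learning for search agents has not delivered the stable gains that were expected. On one hand, the objectives some methods optimize deviate from real downstream needs. On the other hand, the coupling between the searcher and the generator hurts generalization and deployment efficiency.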

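We (UIUC & Amazon) propose s3 (Search-Select-Serve), an RL paradigm that is highly training-efficient, loosely coupled, and oriented toward generation quality. The method uses a reward function called Gain Beyond RAG (GBR), which measures whether the searcher actually brings an effective improvement to generation. Experiments show that, with only 2.4k training samples, s3 outperforms strong baselines trained on a hundred times more data (such as Search-R1 and DeepRetrieval) on question-answering tasks across multiple domains.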

Paper title: s3: You Don't Need That Much Data to Train a Search Agent via RL
Paper link: https://arxiv.org/pdf/2505.14146
Code repository: https://github.com/pat-jj/s3

Motivation

The evolution of RAG: from static retrieval to agentic strategies

We divide the development of RAG systems into three stages:

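1. Classic RAG: uses a fixed query and a retriever such as BM25; the generator gives no feedback on the results.

2. Pre-RL-Zero Active RAG: introduces multi-turn query updates, as in IRCoT and Self-RAG, some of which prompt the LLM to retrieve new information. Self-RAG goes further by distilling the behavior of a large model to train a small model that imitates multi-turn search.

3. RL-Zero stage: reinforcement learning starts to drive retrieval behavior. Representative methods include:

DeepRetrieval: optimizes search metrics such as Recall and NDCG, focusing on the capability of the retriever itself.

Search-R1: models retrieval and generation jointly, uses whether the final answer is an Exact Match as the reinforcement signal, and optimizes an integrated search-generation policy.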

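Although RL methods are more proactive and interactive in design, they still face many challenges in real deployments.

Why current RL-based Agentic RAG underdelivers in practice

We attribute the unstable results, difficult training, and weak transferability of current Agentic RAG approaches to three causes: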

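1. The optimization objective deviates from the real downstream task

Methods such as Search-R1 use Exact Match (EM) as the main reward metric, that is, whether the answer matches the reference answer literally. This metric is overly strict and insensitive to semantic variants, and its signal is sparse early in training. It easily leads the model to optimize "answer token alignment" rather than the search behavior itself.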

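For example, for the question "Who was the 44th president of the United States?":

Answer "Barack Obama": scored as correct.
Answer "The 44th president was Barack Obama.": scored as wrong (EM = 0), even though the answer is right.

Such an unreasonable signal pushes the model to compensate for format at the generation stage, so the reward cannot reflect whether the search strategy itself is effective.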
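The scoring in this example can be reproduced with a minimal sketch. The normalization below (lowercasing, removing punctuation and English articles) is a common QA convention that we assume here for illustration; it is not taken from the Search-R1 code, and the function names are hypothetical.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumed normalization, for illustration only.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> int:
    """Return 1 only when the normalized strings are identical."""
    return int(normalize(prediction) == normalize(reference))


reference = "Barack Obama"
em_short = exact_match("Barack Obama", reference)
em_full = exact_match("The 44th president was Barack Obama.", reference)
print(em_short, em_full)
```

Both answers name the right person, yet only the bare name earns a reward, which is why EM says little about whether the retrieved context was useful.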

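2. Retrieval and generation are coupled, which interferes with search optimization

Including generation in the training objective (as in Search-R1) can raise overall answer accuracy, but it also introduces problems:

It becomes impossible to tell whether a performance gain comes from "better search" or from "stronger language-generation alignment".
The approach depends heavily on the LLM's parameters, which hinders model transfer and integration.
Fine-tuning a large model is expensive, which limits training efficiency and the flexibility to swap modules.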

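3. Existing evaluation criteria cannot accurately measure the contribution of search

Traditional QA metrics such as EM and span match focus mainly on the output and are only loosely related to search quality. Search-oriented metrics (such as Recall@K) can measure retriever performance, but they cannot show whether the retrieved information is actually "put to good use" by the model. These mismatches directly cause bottlenecks in the evaluation, training, and generalization of existing RL Agentic RAG methods.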
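To make the limitation of search-oriented metrics concrete, here is a minimal sketch of Recall@K. The document ids and the function name are hypothetical; the point is that the metric compares id lists only and never looks at what the generator produces from those documents.

```python
from typing import Iterable, Sequence


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: Iterable[str], k: int) -> float:
    """Fraction of relevant documents that appear in the top-k retrieved list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = len(relevant.intersection(retrieved_ids[:k]))
    return hits / len(relevant)


# Hypothetical ranking and relevance labels, for illustration only.
ranking = ["d7", "d2", "d9", "d4"]
gold = {"d2", "d4"}

recall_top3 = recall_at_k(ranking, gold, k=3)
recall_top4 = recall_at_k(ranking, gold, k=4)
print(recall_top3, recall_top4)
```

A searcher can reach perfect recall while the generator still answers incorrectly, which is the gap a generation-based reward is meant to close.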

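s3 - an RL training framework for search agents that focuses on optimizing search quality

The starting point of s3 is simple:

If what we really care about is whether search improves generation, then we should train only the searcher, freeze the generator, and use the improvement in generation results as the reward.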

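This is the definition of Gain Beyond RAG (GBR): whether the generation quality obtained by feeding the context retrieved by s3 to the frozen generator is better than the quality obtained with the initial top-k retrieval results. Note that s3 training always starts from the same original query, so the true "gain" that s3's retrieval brings to the result can be compared cleanly.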
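A minimal sketch of this reward follows. The generator and the scoring function are passed in as callables; the toy versions below are hypothetical stand-ins for a frozen LLM and an accuracy metric, not the s3 implementation.

```python
from typing import Callable, Sequence

Generator = Callable[[str, Sequence[str]], str]   # (question, documents) -> answer
Scorer = Callable[[str, str], float]              # (answer, reference) -> score


def gain_beyond_rag(
    question: str,
    reference: str,
    s3_docs: Sequence[str],
    naive_docs: Sequence[str],
    generator: Generator,
    score: Scorer,
) -> float:
    """GBR = score(G(question, s3_docs)) - score(G(question, naive_docs)).

    The same frozen generator is used on both sides, so any difference
    comes from the retrieved context alone.
    """
    s3_answer = generator(question, s3_docs)
    naive_answer = generator(question, naive_docs)
    return score(s3_answer, reference) - score(naive_answer, reference)


def toy_generator(question: str, docs: Sequence[str]) -> str:
    # Stand-in for a frozen LLM: it answers only if the context mentions the name.
    return "Barack Obama" if any("Obama" in d for d in docs) else "unknown"


def toy_score(answer: str, reference: str) -> float:
    # Stand-in accuracy: 1.0 if the reference appears in the answer.
    return 1.0 if reference.lower() in answer.lower() else 0.0


helpful = ["Barack Obama served as the 44th president."]
unhelpful = ["The White House is in Washington, D.C."]

reward = gain_beyond_rag(
    "Who was the 44th US president?", "Barack Obama",
    s3_docs=helpful, naive_docs=unhelpful,
    generator=toy_generator, score=toy_score,
)
no_gain = gain_beyond_rag(
    "Who was the 44th US president?", "Barack Obama",
    s3_docs=helpful, naive_docs=helpful,
    generator=toy_generator, score=toy_score,
)
print(reward, no_gain)
```

When the searcher's context is no better than the naive top-k context, the reward is zero, so the policy is paid only for the improvement it adds.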

Accuracy (Acc) evaluation criterion

We adopt a more semantics-friendly metric, Generation Accuracy (GenAcc). It combines two mechanisms:

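Span Match: checks whether the generated answer contains a token span matching the reference answer.
LLM Judge: a lightweight LLM judges whether the answer is semantically correct.

The answer is counted as correct if either check passes. In a manual comparison, this metric agrees with human judgment 96.4% of the time, whereas EM agrees only 15.8% of the time.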
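The "either check passes" rule can be sketched as follows. The judge is passed in as a callable; the stub below is a hypothetical placeholder for a lightweight LLM call, and the lowercase-and-whitespace normalization is an assumption made for illustration.

```python
from typing import Callable, Sequence

Judge = Callable[[str, str, Sequence[str]], bool]  # (question, answer, references) -> verdict


def _norm(text: str) -> str:
    # Assumed normalization: lowercase and collapse whitespace.
    return " ".join(text.lower().split())


def span_match(answer: str, references: Sequence[str]) -> bool:
    """True if any reference answer occurs as a span inside the answer."""
    normalized = _norm(answer)
    return any(_norm(ref) in normalized for ref in references)


def generation_accuracy(question: str, answer: str,
                        references: Sequence[str], llm_judge: Judge) -> int:
    """Return 1 if span match passes or the judge accepts the answer."""
    if span_match(answer, references):
        return 1  # cheap check first; the judge is consulted only on a miss
    return int(llm_judge(question, answer, references))


def stub_judge(question: str, answer: str, references: Sequence[str]) -> bool:
    # Placeholder for an LLM call: a fixed table of accepted paraphrases.
    return _norm(answer) in {"obama", "president obama"}


q = "Who was the 44th US president?"
refs = ["Barack Obama"]
acc_sentence = generation_accuracy(q, "The 44th president was Barack Obama.", refs, stub_judge)
acc_alias = generation_accuracy(q, "Obama", refs, stub_judge)
acc_wrong = generation_accuracy(q, "George W. Bush", refs, stub_judge)
print(acc_sentence, acc_alias, acc_wrong)
```

Running the span check first keeps judge calls to the cases that string matching cannot settle.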

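Training and optimization - PPO training needs only 2.4k samples

We use PPO for policy optimization. To improve training efficiency:

We filter out in advance the samples that naive RAG can already answer correctly.
We concentrate the training samples on tasks that genuinely require retrieving new information.
The generator is fully frozen, so the entire training cost is spent on the searcher.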
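The pre-filtering step can be sketched as below. The baseline answerer and the correctness check are hypothetical callables with toy implementations; in a real pipeline they would be a naive RAG run and an accuracy metric.

```python
from typing import Callable, Dict, List

Example = Dict[str, str]  # expects "question" and "answer" keys


def drop_easy_examples(
    examples: List[Example],
    naive_rag_answer: Callable[[str], str],
    is_correct: Callable[[str, str], bool],
) -> List[Example]:
    """Keep only the examples that the naive RAG baseline gets wrong."""
    kept = []
    for ex in examples:
        baseline = naive_rag_answer(ex["question"])
        if not is_correct(baseline, ex["answer"]):
            kept.append(ex)
    return kept


# Toy data and stand-ins, for illustration only.
examples = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "Who directed Titanic?", "answer": "James Cameron"},
]
canned = {"What is the capital of France?": "Paris is the capital."}


def toy_naive_rag(question: str) -> str:
    return canned.get(question, "I don't know")


def toy_is_correct(prediction: str, reference: str) -> bool:
    return reference.lower() in prediction.lower()


hard_examples = drop_easy_examples(examples, toy_naive_rag, toy_is_correct)
print([ex["question"] for ex in hard_examples])
```

Examples the baseline already solves would yield a zero gain no matter what the searcher does, so removing them spends the training budget where a reward signal can exist.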

Total training time for s3 is only 114 minutes (versus 3780 minutes for Search-R1), with about 70 times less data.

Experimental analysis

General QA w/ RAG

Experiment 1: on general QA tasks, s3 outperforms Search-R1 and DeepRetrieval.

We evaluated Direct Inference, Naive RAG, IRCoT, DeepRetrieval, Search-o1, Search-R1, and s3 on six general-domain datasets. In the experiments we used different downstream LLMs, including Qwen2.5-7B-Instruct, Qwen2.5-14B-Instruct, and Claude-3-Haiku.

Although s3 used only 2.4k NQ+HotpotQA training examples (the same training source as Search-R1), it achieved the best performance on five of the datasets, showing notable generalization ability.

Medical QA w/ RAG

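Experiment 2: on medical QA tasks, s3 shows striking cross-domain ability.

We then evaluated the models further on five medical-domain QA datasets, testing with two corpora: Wikipedia2018 (the same as in the general-domain tests) and MedCorp (ACL 2024). The results show that Search-R1 performs well on its training corpus but shows signs of overfitting once the corpus changes. In contrast, s3 transfers stably to different datasets and corpora, which highlights the strong generalization of its searcher-only optimization strategy.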

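Reward optimization curve

Figure 5 shows our reward curve. s3 "converges" quickly, within about 10 training steps (batch size 120). This supports two inferences: (1) pretrained language models already have some search ability, and we only need to "activate" it in a suitable way; (2) within a certain range, moderately increasing the number of documents per search turn and the maximum number of turns helps improve final performance.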

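Ablation study

Effect of removing components on performance (average accuracy) under different configurations. We compared three groups of settings, and the results indicate that the s3 design achieves the best balance between accuracy and efficiency.

Through ablation experiments, we further verified that two key design choices in the s3 framework are necessary: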

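"Begin searching from the original question" keeps the search on track: we found that using the user's original question as the starting point of the first retrieval round helps the model clarify its search goal and build an effective retrieval path. Without this initial point, the search strategy tends to drift off topic, and performance drops significantly.

The "document selection" mechanism significantly reduces token consumption: it lets the model actively filter information after each retrieval round, instead of sending every retrieved result to the generator. With this design, the input tokens of s3 are reduced by 2.6 to 4.2 times on average, which improves efficiency, reduces noise, and benefits generation quality.

Overall, the combination of "starting-point initialization + dynamic selection" in the s3 design is the key to its efficiency and strong generalization. Even though adding more input content yields short-term gains on some datasets, the original s3 structure still shows a more stable advantage in training efficiency, inference speed, and generation accuracy.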
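The two design choices can be illustrated with a simplified search loop. This is a sketch under stated assumptions, not the s3 implementation: in s3 a trained searcher model makes these decisions, whereas here `retrieve`, `select`, and `propose_query` are hypothetical callables backed by a toy corpus.

```python
from typing import Callable, List, Optional

Retrieve = Callable[[str, int], List[str]]               # (query, k) -> documents
Select = Callable[[str, List[str]], List[str]]           # (question, candidates) -> kept documents
Propose = Callable[[str, List[str]], Optional[str]]      # (question, kept so far) -> next query or None


def search_select(question: str, retrieve: Retrieve, select: Select,
                  propose_query: Propose, max_turns: int = 3,
                  docs_per_turn: int = 3) -> List[str]:
    """Multi-turn search that starts from the original question and keeps
    only the documents chosen in each round."""
    selected: List[str] = []
    query: Optional[str] = question          # round 1 always uses the original question
    for _ in range(max_turns):
        candidates = retrieve(query, docs_per_turn)
        for doc in select(question, candidates):
            if doc not in selected:          # avoid passing duplicates downstream
                selected.append(doc)
        query = propose_query(question, selected)
        if query is None:                    # the searcher decides it has enough
            break
    return selected


# Toy corpus and stand-in policies, for illustration only.
QUESTION = "Who directed the film that won Best Picture in 1998?"
corpus = {
    QUESTION: ["Titanic won Best Picture in 1998.",
               "The Oscars are held annually.",
               "1998 in film."],
    "Titanic director": ["James Cameron directed Titanic.",
                         "Titanic sank in 1912."],
}


def toy_retrieve(query: str, k: int) -> List[str]:
    return corpus.get(query, [])[:k]


def toy_select(question: str, docs: List[str]) -> List[str]:
    return [d for d in docs if "Titanic" in d and "sank" not in d]


def toy_propose(question: str, kept: List[str]) -> Optional[str]:
    return None if any("directed" in d for d in kept) else "Titanic director"


context = search_select(QUESTION, toy_retrieve, toy_select, toy_propose)
print(context)
```

In this toy run five documents are retrieved across two rounds but only two are passed on to the generator, which mirrors how selection trims the generator's input.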

FAQ

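Q1: Why do the Search-R1 results we report differ from those in the original paper?

A1: The original Search-R1 paper uses Exact Match (EM) as both the reward and the evaluation metric, and fine-tunes the model specifically for it. Comparing a model optimized for EM with other zero-shot methods is somewhat unfair, and it also makes it hard to measure the effect of search itself. We therefore use the more semantics-friendly Generation Accuracy (GenAcc), which combines span matching with an LLM judgment and agrees with human evaluation 96.4% of the time. In contrast, EM captures only literal agreement and can easily mislead the direction of model optimization.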

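Q2: Why does s3 not train the generator? Does this limit model performance?

A2: The core idea behind s3 is that, if we truly want to optimize search quality, the generator should not be trained. Otherwise the gains from "better search" and from "a stronger language model" become confounded. Freezing the generator not only improves training efficiency (it saves the cost of fine-tuning a large model), it also makes the searcher easy to transfer across tasks and generators, so that search capability is truly "plug and play".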
