
This is a moment in which the underlying logic of AI is being rebuilt. For a long time the Transformer architecture has been stuck in an expensive paradox: we spend state-of-the-art GPU compute making models memorize, by rote, static knowledge that could simply be looked up in a dictionary. The paper "Conditional Memory via Scalable Lookup", released in the early hours of today by Liang Wenfeng's team at DeepSeek together with collaborators from Peking University, sets out to break this deadlock. It proposes a new Engram module that opens a second axis of sparsity, "conditional memory", alongside the traditional "conditional computation" (MoE). This is more than a technical patch; it is a supply-side reform of a model's "brain capacity". The paper's argument is that once memory is separated from computation, with rote recall handed to a "dictionary" and calculation left to the "brain", reasoning ability improves by a counterintuitively large margin. DeepSeek plans to release V4 around the Spring Festival in February, and this moment may well be the eve of DeepSeek V4's arrival.

Prologue: The wasted work of six neural network layers

The story starts with an "MRI scan" that the DeepSeek team ran on the internal workings of the Transformer. Inside the black box, when a large model sees the phrase "Diana, Princess of Wales", it goes through a puzzling and very expensive round of internal churn. The researchers found that, to recognize this fixed entity, the model used a full six layers of the network:

Layers 1-2: the model is still working out that "Wales" is probably a country.
Layer 3: it realizes that this is a geographic concept in Europe.
Layer 4: it begins to piece together that "Princess of Wales" looks like a title.
Layer 5: it associates the phrase with "the wife of the Prince of Wales".
Layer 6: only here does it finally confirm that the phrase refers to the famous Princess Diana.

To an architect who cares about efficiency, this is a gross waste of compute. "Princess Diana" is an objectively existing, static entity whose meaning does not change with context. To retrieve a fact that a dictionary lookup could supply, the Transformer spends six layers of expensive matrix operations "rebuilding" the concept. It is like a genius who has to spend half an hour writing out the multiplication table from memory every time before tackling a calculus problem.

This "implicit memory" mechanism forces the model to spend valuable parameter capacity and network depth on simple pattern matching. In the 33-page paper, DeepSeek asks a pointed question: why not give the model a "super dictionary" that it can consult whenever it needs to?

Chapter 1: Reshaping the architecture, the brute-force aesthetics of the Engram module

To solve this problem, DeepSeek proposes a new module called Engram (conditional memory). If MoE (mixture of experts) divides the "brain" into regions and lets different experts handle different kinds of thinking (conditional computation), then Engram attaches a large external "hippocampus" dedicated to storing static knowledge (conditional memory).

1. Reviving the N-gram: an answer from old wisdom

Engram's core inspiration comes from one of the oldest tools in NLP (natural language processing): the N-gram. Before deep learning took over, language was modeled by counting how likely N words are to appear together. DeepSeek gives this classic idea a modern overhaul:

Traditional Transformer: knowledge is spread across the neuron weights, and extracting it requires passing through dense linear-layer computation, which is costly.
Engram module: a large, scalable embedding table. When the model reads a fixed collocation (an N-gram) such as "Zhang Zhongjing" or "the Four Great Inventions", it does not need to reason with its "cerebral cortex"; it hashes the N-gram to an index and reads the corresponding vector straight out of a memory table.

The time complexity of this lookup is O(1), which means that however large the knowledge table grows (even to 100 billion parameters), lookup speed stays almost constant, and fast.

2. Three technical moats

If table lookup is so good, why did nobody do it before? Because three obstacles stood in the way: storage explosion, polysemy conflicts, and parameter allocation. DeepSeek's answers are as follows.

A. Vocabulary compression: aggressive deduplication

The number of possible word combinations is astronomical. DeepSeek first applies a "lossless compression" step. At the tokenizer level it normalizes tokens that mean the same thing but are written differently. For example, "Apple" (capitalized) and "apple" (lowercase) usually refer to the same thing. Merging such tokens through a mapping shrinks the effective vocabulary by 23%. This saves space and also raises the density of stored knowledge.

B. Multi-head hashing: dealing with hash collisions

It is impossible to store every N-gram. Engram uses multi-head hashing: several hash functions map the unbounded space of N-grams onto a finite number of memory slots. Collisions still occur (two different N-grams mapped to the same slot), but with the multi-head design the model can piece together the correct information from several candidate results, which greatly improves robustness.

C. Context gating: a "referee" for memory

This is the most elegant part of the design. A lookup table is static, while language is not. Take the word "apple": in "eating an apple" it means the fruit; in "Apple launch event" it means the technology company. A raw lookup can therefore inject noise. DeepSeek designs a context-aware gate:

Query: the hidden state of the current context.
Key/Value: the static vectors returned by the lookup.

The gate acts like a referee. If the retrieved "static knowledge" does not fit the current context, the referee pushes the weight down (the gate value tends toward 0) so that the model ignores the noise. If the fit is good (for example, "Treatise on Cold Damage and Miscellaneous Diseases" followed by "Zhang Zhongjing"), the referee opens the gate (the gate value tends toward 1) and the knowledge is injected directly into the model.

Chapter 2: The golden ratio, and the "U-shaped curve" of AI models

With the architecture designed, the next question is how to split the inheritance. Suppose GPU memory is limited and the total parameter budget is fixed. How many parameters should go to MoE "experts" (responsible for computation), and how many to the Engram "dictionary" (responsible for memory)? This is a classic resource allocation trade-off. The DeepSeek team ran a large-scale ablation that swept the allocation ratio from 0% to 100%, and the results trace out a clean "U-shaped scaling law curve". The curve shows the following:

Left extreme (pure Engram): if all parameters go to the dictionary, loss is high. The model becomes a "bookworm" that can recite but cannot reason.
Right extreme (pure MoE): if all parameters go to experts, loss is also high. The experts are forced to spend their effort on memorizing static knowledge and have no capacity left for real work.
Golden ratio (ρ ≈ 75%-80%): when about 20%-25% of the sparse parameter budget goes to Engram and the rest goes to MoE, validation loss reaches its minimum.

This is a finding with practical guidance value: for large models with tens of billions of parameters, simply piling up compute units (MoE experts) already shows diminishing marginal returns, and a dedicated static memory module is needed to balance storage and computation.

Chapter 3: A counterintuitive jump, or why "looking things up" improves math scores

If Engram only gave the model a better memory, the paper would not have drawn this much attention. After all, RAG (retrieval-augmented generation) can also address knowledge problems. What surprised the field were the unexpected gains in the experimental results. DeepSeek built three comparison models, holding the number of activated parameters (3.8B) and the amount of training data (262B tokens) exactly the same:

Dense-4B: a traditional dense model.
MoE-27B: a pure MoE model (72 experts).
Engram-27B: a hybrid model (55 experts + 5.7B Engram parameters).

The results were surprising.

1. Expected: gains on knowledge tasks

On MMLU (general knowledge) the Engram model gains 3.4 points; on CMMLU (Chinese knowledge) it gains 4.0 points. This is easy to understand: with an external dictionary attached, knowledge recall improves and hallucinations become less frequent.

2. Unexpected: gains across logic, code, and math

In principle, "looking things up" has nothing to do with "solving math problems". Yet on BBH (general reasoning), Engram-27B beats the pure MoE baseline with the same parameter count by a full 5.0 points.

MATH (mathematics): +2.4 points.
HumanEval (code generation): +3.0 points.
ARC-Challenge (complex reasoning): +3.7 points.

3. Deeper analysis: the effective depth theory

Why would a rote-memory module raise a model's "IQ"? The DeepSeek team used LogitLens and CKA (centered kernel alignment) to dissect the models internally, and they observed a striking pattern. Recall "Princess Diana" from the prologue. In the pure MoE model, the first few layers are busy piecing the concept together. In the Engram model, because the Engram module is inserted at layer 2, retrieval of static knowledge is finished at a very early stage. The early layers that used to do rote memorization are freed up, which amounts to giving the model extra effective depth.

The freed layers and attention heads no longer need to handle trivial local dependencies (such as identifying who "Zhang Zhongjing" is), so they can be devoted to more complex global reasoning, long-range logic, and code generation. Engram does not replace reasoning; it offloads the chores so that the "brain" can focus on higher-level thinking.

Chapter 4: An engineering feat, breaking Nvidia's "memory hegemony"

For Wall Street investors and the operators of compute centers, the most attractive part of this paper is not the score but the cost. In the AI era the most expensive resource is not compute (FLOPs) but GPU memory (HBM). A large part of why the Nvidia H100 is so expensive is its scarce HBM. Engram brings a disruptive property: complete separation of storage from computation.

1. The MoE pain point: a memory hog

In a traditional MoE model the routing mechanism is dynamic. The model has to compute the features of the current token and finish the current layer before it knows which expert to call in the next layer. As a result, all experts must stay resident in expensive GPU memory, ready on demand.

2. The Engram breakthrough: deterministic foresight

Engram's lookup logic is deterministic. Once the input text is fixed (for example, "A New Axis of Sparsity"), the corresponding N-gram indices are fixed as well. There is no need to wait for the previous layer to finish: the moment a token enters the model, we already know which row of which table it will need.

3. The CPU strikes back: fitting a large model into DRAM

This property brings substantial engineering benefits:

Offload: an Engram table with tens of billions, or even hundreds of billions, of parameters can be placed in cheap, plentiful, easily expanded CPU memory (DRAM), or even on NVMe SSDs.
Prefetching: while the GPU is busy computing the previous Transformer layer, the CPU uses the PCIe link to asynchronously fetch the memory data the next layer needs and push it to the GPU. Latency is hidden by running the two in parallel.

DeepSeek's measurements show that even with a 100B-parameter Engram table held in CPU memory, throughput drops by less than 3% compared with pure GPU inference. That is welcome news for anyone anxious about not being able to buy HBM. It means the "memory capacity" of future large models could be expanded at low cost, almost without limit, without being choked by Nvidia's GPU memory.

Chapter 5: A win for long context, the jump on NIAH tests

Beyond general reasoning, Engram's performance on long context also demonstrates the value of this division of labor. In long-context processing, the window of the attention mechanism is limited. If attention is taken up by large amounts of local information (such as fixed short…
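As a rough illustration of the lookup path described in Chapter 1 (token normalization, multi-head hashing of an N-gram into fixed-size tables, and a context-aware gate), here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration and is not taken from the paper: the names `normalize`, `slot_index`, `lookup`, `gate`, and `inject`, the table sizes, the use of BLAKE2 as the hash, averaging across heads, and a sigmoid of a scaled dot product as the gate. The actual Engram module may combine heads and compute its gate quite differently.

```python
import hashlib
import math
import random

DIM = 8        # embedding width (illustrative assumption)
SLOTS = 1024   # memory slots per hash head (illustrative assumption)
HEADS = 4      # number of independent hash heads (illustrative assumption)


def normalize(token: str) -> str:
    """Stand-in for vocabulary compression: fold case so 'Apple' and 'apple' match."""
    return token.lower()


def slot_index(ngram: tuple, head: int) -> int:
    """Deterministically map an N-gram to a slot; each head salts the hash differently."""
    key = f"{head}|" + "\x01".join(ngram)
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % SLOTS


def make_tables(seed: int = 0) -> list:
    """One randomly initialized table per head, shape HEADS x SLOTS x DIM."""
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(SLOTS)]
            for _ in range(HEADS)]


def lookup(tables: list, tokens: list, n: int = 2) -> list:
    """Fetch one row per head for the trailing n-gram and average them.

    The cost depends on HEADS and DIM only, not on how many slots the tables hold.
    """
    ngram = tuple(normalize(t) for t in tokens[-n:])
    rows = [tables[h][slot_index(ngram, h)] for h in range(HEADS)]
    return [sum(column) / HEADS for column in zip(*rows)]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def gate(hidden: list, memory: list) -> float:
    """Scalar gate in (0, 1): high when the retrieved vector agrees with the context."""
    score = sum(h * m for h, m in zip(hidden, memory)) / math.sqrt(DIM)
    return sigmoid(score)


def inject(hidden: list, memory: list) -> list:
    """Add the gated memory vector to the hidden state."""
    g = gate(hidden, memory)
    return [h + g * m for h, m in zip(hidden, memory)]


if __name__ == "__main__":
    tables = make_tables()
    hidden = [0.1] * DIM  # placeholder for a real hidden state
    memory = lookup(tables, ["Princess", "of", "Wales"], n=3)
    print("gate value:", gate(hidden, memory))
    print("updated hidden state:", inject(hidden, memory))
```

Because `slot_index` depends only on the token text and the head number, the indices for a whole input can be computed before any layer runs, which is the property Chapter 4 relies on for prefetching rows from CPU memory.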




