A Wide-Ranging Discussion of Artificial Intelligence: Opening Post
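Since April, with the launch of ChatGPT and Bing Chat, an AI craze has swept the web and LINE groups. I was probably busy discussing 《我們的反戰聲明》 ("Our Anti-War Statement") at the time and did not join in. I am now reposting several related articles. Please also see the article 《「人工智慧」研發現況及展望》 ("Current State and Prospects of 'Artificial Intelligence' R&D") and this column's post of 2025/08/11.

Some people worry that "artificial intelligence" will become a "machine overlord" that controls the world or even enslaves humanity. I do not understand AI, and my thinking is simple; so if "artificial intelligence" ever runs amok, I believe there is a simple way to deal with it: pull the power plug. If that is not enough, blow up the power transmission lines and the emergency generators; if that still fails, blow up the power plants.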

Are AI Companies Facing a Bubble Crisis? -- Nick Lichtenberg
The article below is well worth reading and keeping on file. Besides citing a wide range of sources, the author weaves in a short "brief history of the interplay between technology and the economy"; both are uncommon in technical reporting. Still, are the potential of AI technology and the prospects of AI companies boundless, or is the road ahead a troubled one? The article offers no verdict; I believe the vast majority of people can only watch, guess, or grope in the dark.

Readers with a strong interest in artificial intelligence may wish to see: this article (that column, 2025/08/16), this article (that column, 2025/08/11), and this column's posts of 2025/08/24, 2025/07/16, and 2025/06/12.

‘It’s almost tragic’: Bubble or not, the AI backlash is validating what one researcher and critic has been saying for years

Nick Lichtenberg, 08/24/25

First it was the release of GPT-5 that OpenAI “totally screwed up,” according to Sam Altman. Then Altman followed that up by saying the B-word at a dinner with reporters. “When bubbles happen, smart people get overexcited about a kernel of truth,” The Verge reported on comments by the OpenAI CEO. Then it was the sweeping MIT survey that put a number on what so many people seem to be feeling: a whopping 95% of generative AI pilots at companies are failing.

A tech sell-off ensued, as rattled investors sent the value of the S&P 500 down by $1 trillion. Given the increasing dominance of that index by tech stocks that have largely transformed into AI stocks, it was a sign of nerves that the AI boom was turning into dotcom bubble 2.0. To be sure, fears about the AI trade aren’t the only factor moving markets, as evidenced by the S&P 500 snapping a five-day losing streak on Friday after Jerome Powell’s quasi-dovish comments at Jackson Hole, Wyoming, as even the hint of openness from the Fed chair toward a September rate cut set markets on a tear.

Gary Marcus has been warning of the limits of large language models (LLMs) since 2019 and warning of a potential bubble and problematic economics since 2023. His words carry a particularly distinctive weight. The cognitive scientist turned longtime AI researcher has been active in the machine learning space since 2015, when he founded Geometric Intelligence. That company was acquired by Uber in 2016, and Marcus left shortly afterward, working at other AI startups while offering vocal criticism of what he sees as dead-ends in the AI space.
Still, Marcus doesn’t see himself as a “Cassandra” (a prophet no one believes), and he’s not trying to be, he told Fortune in an interview. Cassandra, a figure from Greek tragedy, was a character who uttered accurate prophecies but wasn’t believed until it was too late. “I see myself as a realist and as someone who foresaw the problems and was correct about them.”

Marcus attributes the wobble in markets to GPT-5 above all. It’s not a failure, he said, but it’s “underwhelming,” a “disappointment,” and that’s “really woken a lot of people up. You know, GPT-5 was sold, basically, as AGI, and it just isn’t,” he added, referencing artificial general intelligence, a hypothetical AI with human-like reasoning abilities. “It’s not a terrible model, it’s not like it’s bad,” he said, but “it’s not the quantum leap that a lot of people were led to expect.”

Marcus said this shouldn’t be news to anyone paying attention, as he argued in 2022 that “deep learning is hitting a wall.” To be sure, Marcus has been wondering openly on his Substack about when the generative AI bubble will deflate. He told Fortune that “crowd psychology” is definitely taking place, and he thinks every day about the John Maynard Keynes quote: “The market can stay irrational longer than you can stay solvent,” or Looney Tunes’s Wile E. Coyote following Road Runner off the edge of a cliff and hanging in midair, before falling down to Earth.

“That’s what I feel like,” Marcus says. “We are off the cliff. This does not make sense. And we get some signs from the last few days that people are finally noticing.”

Building warning signs

The bubble talk began heating up in July, when Apollo Global Management’s chief economist, Torsten Slok, widely read and influential on Wall Street, issued a striking calculation while falling short of declaring a bubble.
“The difference between the IT bubble in the 1990s and the AI bubble today is that the top 10 companies in the S&P 500 today are more overvalued than they were in the 1990s,” he wrote, warning that the forward P/E ratios and staggering market capitalizations of companies such as Nvidia, Microsoft, Apple, and Meta had “become detached from their earnings.”

In the weeks since, the disappointment of GPT-5 was an important development, but not the only one. Another warning sign is the massive amount of spending on data centers to support all the theoretical future demand for AI use. Slok has tackled this subject as well, finding that data center investments’ contribution to GDP growth has been the same as consumer spending over the first half of 2025, which is notable since consumer spending makes up 70% of GDP. (The Wall Street Journal’s Christopher Mims had offered the calculation weeks earlier.)

Finally, on August 19, former Google CEO Eric Schmidt co-authored a widely discussed New York Times op-ed arguing that “it is uncertain how soon artificial general intelligence can be achieved.” This is a significant about-face, according to political scientist Henry Farrell, who argued in the Financial Times in January that Schmidt was a key voice shaping the “New Washington Consensus,” predicated in part on AGI being “right around the corner.” On his Substack, Farrell said Schmidt’s op-ed shows that his prior set of assumptions is “visibly crumbling away,” while caveating that he had been relying on informal conversations with people he knew in the intersection of D.C. foreign policy and tech policy. Farrell’s title for that post: “The twilight of tech unilateralism.” He concluded: “If the AGI bet is a bad one, then much of the rationale for this consensus falls apart. And that is the conclusion that Eric Schmidt seems to be coming to.”

Beyond these signals, the vibe is shifting in the summer of 2025 into a mounting AI backlash.
Darrell West warned in Brookings in May that the tide of both public and scientific opinion would soon turn against AI’s masters of the universe. Soon after, Fast Company predicted the summer would be full of “AI slop.” By early August, Axios had identified the slang “clunker” being applied widely to AI mishaps, particularly in customer service gone awry.

History says: short-term pain, long-term gain

John Thornhill of the Financial Times offered some perspective on the bubble question, advising readers to brace themselves for a crash, but to prepare for a future “golden age” of AI nonetheless. He highlights the data center buildout: a staggering $750 billion investment from Big Tech over 2024 and 2025, and part of a global rollout projected to hit $3 trillion by 2029.

Thornhill turns to financial historians for some comfort and some perspective. Over and over, history shows that this type of frenzied investment typically triggers bubbles, dramatic crashes, and creative destruction, but that eventually durable value is realized. He notes that Carlota Perez documented this pattern in Technological Revolutions and Financial Capital: The Dynamics of Bubbles and Golden Ages. She identified AI as the fifth technological revolution to follow the pattern begun in the late 18th century, as a result of which the modern economy now has railroad infrastructure and personal computers, among other things. Each one had a bubble and a crash at some point.

Thornhill didn’t cite him in this particular column, but Edward Chancellor documented similar patterns in his classic Devil Take The Hindmost, a book notable not just for its discussions of bubbles but for predicting the dotcom bubble before it happened. Owen Lamont of Acadian Asset Management cited Chancellor in November 2024, when he argued that a key bubble moment had been passed: an unusually large number of market participants saying that prices are too high, but insisting that they’re likely to rise further.
Wall Street is cautious, but not calling a bubble

Wall Street banks are largely not calling it a bubble. Morgan Stanley released a note recently seeing huge efficiencies ahead for companies as a result of AI: $920 billion per year for the S&P 500. UBS, for its part, concurred with the caution flagged in the news-making MIT research. It warned investors to expect a period of “capex indigestion” accompanying the data center buildout, but it also maintained that AI adoption is expanding far beyond expectations, citing growing monetization from OpenAI’s ChatGPT, Alphabet’s Gemini, and AI-powered CRM systems.

Bank of America Research wrote a note in early August, before the launch of GPT-5, seeing AI as part of a worker productivity “sea change” that will drive an ongoing “innovation premium” for S&P 500 firms. Head of U.S. Equity Strategy Savita Subramanian essentially argued that the inflation wave of the 2020s taught companies to do more with less, to turn people into processes, and that AI will turbo-charge this. “I don’t think it’s necessarily a bubble in the S&P 500,” she told Fortune in an interview, before adding, “I think there are other areas where it’s becoming a little bit bubble-like.”

Subramanian mentioned smaller companies and potentially private lending as areas “that potentially have re-rated too aggressively.” She’s also concerned about the risk of companies diving into data centers to such a great extent, noting that this represents a shift back toward an asset-heavier approach, instead of the asset-light approach that increasingly distinguishes top performance in the U.S. economy. “I mean, this is new,” she said.
“Tech used to be very asset-light and just spent money on R&D and innovation, and now they’re spending money to build out these data centers,” adding that she sees it as potentially marking the end of their asset-light, high-margin existence and basically transforming them into “very asset-intensive and more manufacturing-like than they used to be.” From her perspective, that warrants a lower multiple in the stock market. When asked if that is tantamount to a bubble, if not a correction, she said “it’s starting to happen in places,” and she agrees with the comparison to the railroad boom.

The math and the ghost in the machine

Gary Marcus also cited the fundamentals of math as a reason that he’s concerned, with nearly 500 AI unicorns being valued at $2.7 trillion. “That just doesn’t make sense relative to how much revenue is coming [in],” he said. Marcus cited OpenAI reporting $1 billion in revenue in July, but still not being profitable. Speculating, he extrapolated that to OpenAI having roughly half the AI market, and offered a rough calculation that it means about $25 billion a year of revenue for the sector, “which is not nothing, but it costs a lot of money to do this, and there’s trillions of dollars [invested].”

So if Marcus is correct, why haven’t people been listening to him for years? He said he’s been warning people about this for years, too, calling it the “gullibility gap” in his 2019 book Rebooting AI and arguing in The New Yorker in 2012 that deep learning was a ladder that wouldn’t reach the moon. For the first 25 years of his career, Marcus trained and practiced as a cognitive scientist, and learned about the “anthropomorphization people do. … [they] look at these machines and make the mistake of attributing to them an intelligence that is not really there, a humanness that is not really there, and they wind up using them as a companion, and they wind up thinking that they’re closer to solving these problems than they actually are.” He said he thinks the bubble inflating to its current extent is in large part because of the human impulse to project ourselves onto things, something a cognitive scientist is trained not to do.

These machines might seem like they’re human, but “they don’t actually work like you,” Marcus said, adding, “this entire market has been based on people not understanding that, imagining that scaling was going to solve all of this, because they don’t really understand the problem. I mean, it’s almost tragic.”

Subramanian, for her part, said she thinks “people love this AI technology because it feels like sorcery. It feels a little magical and mystical … the truth is it hasn’t really changed the world that much yet, but I don’t think it’s something to be dismissed.” She’s also become really taken with it herself. “I’m already using ChatGPT more than my kids are. I mean, it’s kind of interesting to see this. I use ChatGPT for everything now.”

This story was originally featured on Fortune.com
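
Marcus's back-of-the-envelope calculation can be reproduced in a few lines. This is only a sketch under stated assumptions: it reads the $1 billion figure as one month of revenue, annualizes it by multiplying by 12, and takes his "roughly half the AI market" guess at face value. The function and variable names are illustrative, not from the article.

```python
def annualized_sector_revenue(monthly_revenue_usd: float, market_share: float) -> float:
    """Annualize one company's monthly revenue and scale it up to the whole sector."""
    if not 0 < market_share <= 1:
        raise ValueError("market_share must be in (0, 1]")
    return monthly_revenue_usd * 12 / market_share


# Figures quoted in the article; the monthly reading of "$1 billion in July" is an assumption.
OPENAI_JULY_REVENUE = 1e9      # USD, treated here as one month of revenue
ASSUMED_MARKET_SHARE = 0.5     # Marcus's speculative "roughly half the AI market"
UNICORN_VALUATION = 2.7e12     # USD, combined valuation of nearly 500 AI unicorns

sector_revenue = annualized_sector_revenue(OPENAI_JULY_REVENUE, ASSUMED_MARKET_SHARE)
valuation_multiple = UNICORN_VALUATION / sector_revenue

print(f"Implied sector revenue: ${sector_revenue / 1e9:.0f} billion per year")
print(f"Valuation relative to that revenue: {valuation_multiple:.1f}x")
```

Under these assumptions the sector figure comes out at $24 billion a year, close to the "about $25 billion" Marcus quotes, and the unicorn valuations are on the order of a hundred times that revenue. The comparison is loose, since the unicorn total and the sector revenue estimate do not cover exactly the same set of companies.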

MIT Report: AI Pilot Programs Fail at a Rate as High as 95% -- Sheryl Estrada
See also: Sam Altman admits OpenAI ‘totally screwed up’ its GPT-5 launch and says the company will spend trillions of dollars on data centers

MIT report: 95% of generative AI pilots at companies are failing

Sheryl Estrada, 08/18/25

Good morning. Companies are betting on AI, yet nearly all enterprise pilots are stuck at the starting line. The GenAI Divide: State of AI in Business 2025, a new report published by MIT’s NANDA initiative, reveals that while generative AI holds promise for enterprises, most initiatives to drive rapid revenue growth are falling flat.

Despite the rush to integrate powerful new models, only about 5% of AI pilot programs achieve rapid revenue acceleration; the vast majority stall, delivering little to no measurable impact on P&L. The research, based on 150 interviews with leaders, a survey of 350 employees, and an analysis of 300 public AI deployments, paints a clear divide between success stories and stalled projects.

To unpack these findings, I spoke with Aditya Challapally, the lead author of the report, and a research contributor to project NANDA at MIT.

“Some large companies’ pilots and younger startups are really excelling with generative AI,” Challapally said. Startups led by 19- or 20-year-olds, for example, “have seen revenues jump from zero to $20 million in a year,” he said. “It’s because they pick one pain point, execute well, and partner smartly with companies who use their tools,” he added.

But for 95% of companies in the dataset, generative AI implementation is falling short. The core issue? Not the quality of the AI models, but the “learning gap” for both tools and organizations. While executives often blame regulation or model performance, MIT’s research points to flawed enterprise integration. Generic tools like ChatGPT excel for individuals because of their flexibility, but they stall in enterprise use since they don’t learn from or adapt to workflows, Challapally explained.

The data also reveals a misalignment in resource allocation.
More than half of generative AI budgets are devoted to sales and marketing tools, yet MIT found the biggest ROI in back-office automation: eliminating business process outsourcing, cutting external agency costs, and streamlining operations.

What’s behind successful AI deployments?

How companies adopt AI is crucial. Purchasing AI tools from specialized vendors and building partnerships succeed about 67% of the time, while internal builds succeed only one-third as often.

This finding is particularly relevant in financial services and other highly regulated sectors, where many firms are building their own proprietary generative AI systems in 2025. Yet, MIT’s research suggests companies see far more failures when going solo.

Companies surveyed were often hesitant to share failure rates, Challapally noted. “Almost everywhere we went, enterprises were trying to build their own tool,” he said, but the data showed purchased solutions delivered more reliable results.

Other key factors for success include empowering line managers, not just central AI labs, to drive adoption, and selecting tools that can integrate deeply and adapt over time.

Workforce disruption is already underway, especially in customer support and administrative roles. Rather than mass layoffs, companies are increasingly not backfilling positions as they become vacant. Most changes are concentrated in jobs previously outsourced due to their perceived low value.

The report also highlights the widespread use of “shadow AI” (unsanctioned tools like ChatGPT) and the ongoing challenge of measuring AI’s impact on productivity and profit.

Looking ahead, the most advanced organizations are already experimenting with agentic AI systems that can learn, remember, and act independently within set boundaries, offering a glimpse at how the next phase of enterprise AI might unfold.

Sheryl Estrada, sheryl.estrada@fortune.com

This story was originally featured on Fortune.com
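
The buy-versus-build comparison in the report can be made concrete with a little arithmetic. The sketch below assumes that "one-third as often" means one third of the 67% vendor success rate; the article does not spell out the exact internal-build figure, so the result is an inference, and the function name is illustrative.

```python
def expected_successes(attempts: int, success_rate: float) -> float:
    """Expected number of successful deployments out of a given number of attempts."""
    return attempts * success_rate


VENDOR_SUCCESS_RATE = 0.67                        # "about 67% of the time" per the article
INTERNAL_SUCCESS_RATE = VENDOR_SUCCESS_RATE / 3   # assumption: one third of the vendor rate

attempts = 100
print(f"Internal build success rate: {INTERNAL_SUCCESS_RATE:.1%}")
print(f"Per {attempts} vendor projects:   {expected_successes(attempts, VENDOR_SUCCESS_RATE):.0f} succeed")
print(f"Per {attempts} internal projects: {expected_successes(attempts, INTERNAL_SUCCESS_RATE):.0f} succeed")
```

Read this way, an internal build succeeds a little over one time in five, against roughly two in three for purchased tools.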

The AI Effect: Tech Companies Have Cut 50,000 Jobs So Far This Year -- Cryptopolitan
See also:
* Enterprises Push Ahead with AI-Powered Job Replacement Despite Risks
* This CEO laid off nearly 80% of his staff because they refused to adopt AI fast enough. 2 years later, he says he’d do it again

White-collar employees under 60 need to stay alert. Because of accumulated salary growth and declining productivity (due to waning mental and physical capacity), "older" employees over 40 are all the more a "high-risk group"; I speak of both factors from personal experience.

Around 50,000 tech jobs replaced by AI as companies restructure for the future

Cryptopolitan, 08/14/25

Around 50,000 tech job roles have been replaced with AI as companies like Oracle, Intel, Microsoft, and Nextdoor restructured their workforces. The nationwide layoffs have been attributed to firms pursuing large-scale investments in artificial intelligence and managing the AI-related costs.

Oracle confirmed changes in its cloud division, cutting more than 150 workers in Seattle, the base of its cloud operations. Some of the affected roles include engineering and support positions. Oracle informed staff that performance factors contributed to their release. However, the company continues to hire in other business units.

Tech companies slash jobs to boost AI and cloud spending

Oracle did not provide an official statement regarding the number of employees laid off. However, the company stated in June that restructuring and workforce changes occur periodically in connection with company strategies, operational reorganizations, and performance reviews. The report revealed that the actions would result in high near-term costs and decreased productivity as employees adjust to the new structure.

Oracle’s stock has traded near record highs recently following the growth in its cloud business. In June, it announced an agreement with OpenAI to secure 4.5 gigawatts of data center power capacity in the U.S. The deal forms part of initiatives to support AI workloads that require large compute power capacity. The database pioneer has committed tens of billions toward constructing new data centers that can handle AI-related tasks.
In its Q2 results, it reported a negative free cash flow reflecting the scale of investment it has entered into.

Big tech firms are making similar moves industry-wide. Microsoft has laid off approximately 15,000 employees since the beginning of 2025. The affected roles include teams across engineering, sales, and support. Amazon and Meta have also laid off staff in 2025.

Scale AI, a San Francisco-based firm specializing in AI data labelling, Nextdoor, the neighborhood-focused social networking platform, and Intel have all announced reductions in staff this year. Since January, the industry-wide layoffs have contributed to approximately 50,000 jobs lost across the technology sector.

In June, Microsoft outlined plans to lay off multiple roles following the 6,000 job cuts in May. At least 22,000 jobs had already been cut in 2025 while 150,000 jobs were lost in 2024 across hundreds of companies. The job cuts occur as companies rush to expand spending in AI development and infrastructure. Amazon and Meta are investing more than $100 billion in AI.

Industry-wide job cuts spread across multiple geographic locations

Oracle stated that strategy changes and performance outcomes influence workforce changes. It also noted that although such measures are intended to provide financial caution, they may involve transitional costs and short-term productivity issues.

Industry analysts have noted that layoffs have affected various roles, from entry-level technical positions to senior management. The effects of the layoffs have been distributed across different locations, with heavy concentration in the Seattle area, the San Francisco Bay Area, and Austin, Texas.

Amid the layoffs, companies continue to recruit in other business units. Oracle’s cloud division indicates that its recruiting positions are aligned with priority growth areas.
Roles related to AI infrastructure development, data centre operations, and high-demand enterprise software services are some of the key areas adding employees.

Cryptopolitan reported that about 41% of executives expect the tech workforce to shrink by 2030 due to AI automation. The report noted remarks by Nvidia CEO Jensen Huang that while AI will make some jobs obsolete, it will also create new ones, and industries could grow if fresh ideas keep emerging.

5 Clever Tricks to Get the Most out of ChatGPT -- Sejalbaranwal
Don't just read, practice! Put what you learn to use right away!
I Use These 5 ChatGPT Prompts Daily - You'll Wish You Knew These Sooner

Sejalbaranwal, 07/29/25

There's always a need to understand the truth. But sometimes, it hits you only after you have wasted a whole month.

I spent hours, no, days, just staring at the screen. Trying to write blogs, emails, LinkedIn posts, create course layouts, and even summarize my study notes. Even simple decisions, like scheduling my day, became time-consuming.

Why? Because I was stuck in the old-school mindset: Do it all yourself. Work harder, not smarter. And it led to just one thing: burnout. In just 10 days, I was completely exhausted. The process of thinking, writing, rewriting, and finding someone to give feedback drained me.

I knew about ChatGPT. But I didn't use it. Why? Because of overconfidence. Because I thought: "AI won't help unless I work hard myself." And yes, if you give vague prompts (instructions), ChatGPT will give vague answers. But sometimes, we have to fall flat on our face to learn the truth.

One Month Wasted. Then a Day That Changed Everything.

I realized: working all day is not the same as being productive. I was getting nothing done. No satisfaction. No results. Then one day, I asked ChatGPT the right questions. And suddenly, things started shifting. Want to know how? Here are 5 mind-blowing prompts that changed how I work. These will unlock your productivity too.

1. Writing Faster Than a Bullet Train

First, I took inspiration from the amazing Medium writers. I started creating first drafts in no time. Here are the magic prompts:
1. Provide me with underrated angles of the given topic.
2. Give the blog outline in AIDA format (Attention, Interest, Desire, Action).
3. Summarize real-life case studies and statistics.

ChatGPT gives everything in seconds. But I don't copy-paste. Here's how I use it: I arrange the content in my style of writing. I prefer short, powerful sentences. I still do thorough research. So no, AI doesn't replace my thinking - it accelerates it.

2. Email Writing - It Hurts

[Screenshot of the GPT writing emails: please see the original web page]

Writing like I speak? Easy. Writing to sound professional? Not my cup of tea. So I started using GPT Email Writer. I just type my message and ask: Write this in a professional tone. Done. Polished. Clear. Respectful. It literally saves me from awkward backspacing for hours.

3. Codeless Websites on the Rise

Everyone's talking about building websites. Few talk about doing it without writing a single line of code. After my coder friend let me down, I went to YouTube: How to build a website without coding? I discovered DeepSite, a web development tool powered by AI. I typed this prompt: Create a portfolio website that looks modern, impressive, and up-to-date for a content writer. In seconds, my website was ready. And honestly? It looked great. You should try it - you'll be surprised how professional it feels.

4. Creating Images Like a Pro

We often give vague prompts like: Create a picture of a sunset below the horizon with a river.

[AI-generated image: please see the original web page]

But that's not how to get the best out of image-generation AI. Here's the improved version: Generate a scenic sunset over a mountain range with a river having profound depths. The sky is filled with mysterious, stormy clouds. The atmosphere looks fiery with fierce reds and vivid orange. The scene should feel overwhelming - like real nature.

[AI-generated image: please see the original web page]

The result? Mind-blowing. Detail, colors, emotions - all come alive. Lesson: Precision prompts = pro-quality visuals.

5. Started My Course Creation Journey

[Video and screenshot: please see the original web page]

Creating a full course? It takes months - I thought. Then I read about a collaboration between Blue Carrot and GenEd AI: They created two courses - one traditionally, and one with AI. Both got similar ratings. But the AI course? It was done in just 1 week. It was 25 times faster.

I was triggered by this article in a good way. I created two test videos. The AI-assisted one got selected. Here's the prompt that helped me plan my course: Give me stepwise ideas for publishing a Course on Udemy. I'll share my research on how competitors are structuring theirs. Your job is to arrange everything in sequence. Provide a table that breaks down each session clearly. It worked. And I am not turning back.

Now It's Your Turn

Try any one of these prompts today. See the shift. Feel the productivity. Don't just save this post - use it. Copy the prompts. Edit them to fit your focus on creativity. Because the truth is simple: Smart work doesn't replace hard work. It just makes it worth it.

Bye for now.

This article was published on July 29th, 2025 on learnaitoprofit.com.

Written by Sejalbaranwal: Get articles on AI, productivity, and important skills with a touch of personal experience. My stories will take you into a different dimension.

Published in LearnAItoprofit.com: Artificial Intelligence Tools For Writers. How to use AI tools and generative AI for passive income for your side hustle or home business.
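
For readers who want to reuse the three writing prompts from section 1, they can be kept as templates and filled in per topic. This is a minimal sketch, not anything the author describes doing: the helper name `build_writing_prompts` and the trailing "Topic: ..." suffix are hypothetical additions, and the output is plain text to paste into ChatGPT, with no API call involved.

```python
# The three writing prompts quoted in the article, kept verbatim as templates.
WRITING_PROMPTS = [
    "Provide me with underrated angles of the given topic.",
    "Give the blog outline in AIDA format (Attention, Interest, Desire, Action).",
    "Summarize real-life case studies and statistics.",
]


def build_writing_prompts(topic: str) -> list[str]:
    """Attach a concrete topic to each prompt (the 'Topic:' suffix is a hypothetical addition)."""
    topic = topic.strip()
    if not topic:
        raise ValueError("topic must not be empty")
    return [f"{prompt} Topic: {topic}" for prompt in WRITING_PROMPTS]


for number, prompt in enumerate(build_writing_prompts("remote work burnout"), start=1):
    print(f"{number}. {prompt}")
```

Naming the topic explicitly follows the article's own advice that vague prompts produce vague answers.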

An Introduction to DeepSeek (深度求索) -- iKala
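DeepSeek Basics

What is DeepSeek?
DeepSeek is a large language model developed by the Chinese AI company DeepSeek (深度求索), intended to give enterprises, developers, and individual users an efficient, cost-advantaged AI solution. DeepSeek excels at natural language processing and can handle conversation, text generation, translation, code writing, and other tasks, improving efficiency in learning, work, and creative content generation.

What are DeepSeek's main features?
DeepSeek offers a range of features, chiefly DeepSeek Chat and DeepSeek Coder. DeepSeek Chat is a conversational AI application that interacts with you like a real person: it answers questions, offers suggestions, and chats with you. DeepSeek Coder is a code generation tool that helps developers write, test, and debug code quickly. DeepSeek also provides text generation, translation, summarization, and other functions, with a wide range of applications.

What are DeepSeek's application areas?
DeepSeek is widely applicable, covering the following areas:
* Education: as an intelligent learning assistant that offers problem solving and study advice.
* Finance: analyzing market data to support risk assessment and investment decisions.
* Customer service: as an intelligent customer service agent that responds quickly to customer questions and improves service efficiency.
* Content creation and media: supporting article writing, news summaries, social media content generation, and more.

Is DeepSeek an open-source model? Can it be deployed locally?
DeepSeek provides an open-source version that lets developers freely use, modify, and optimize the model to meet specific needs. Users can deploy DeepSeek in a local (on-premises) environment without relying on cloud services, which improves data privacy and security. Developers can currently install and run DeepSeek models easily with tools such as Ollama and run them offline, ensuring that sensitive data does not leak while lowering long-term computing costs. This flexibility of open source plus local deployment makes DeepSeek a choice that gives enterprises and developers more control over their AI applications.

Comparing and Choosing DeepSeek

How do DeepSeek and OpenAI differ in performance? What is the cost advantage?
* Model correspondence:
** DeepSeek V3 performs close to OpenAI GPT-4o and stands out particularly on Chinese NLP tasks.
** DeepSeek R1 is comparable to OpenAI o3-mini and suits general text generation and reasoning needs.
* Openness and deployment:
** DeepSeek uses an open-source architecture that can be deployed locally and customized, offering greater flexibility and control over data.
** OpenAI provides a closed API service suited to cloud applications, with no infrastructure to maintain.
* Cost considerations:
** The DeepSeek open-source version can lower long-term operating costs and is especially suitable for enterprises that build their own computing resources or need local deployment.
** The OpenAI API provides a stable cloud service but bills by usage, so long-term costs may be higher.

How does DeepSeek differ from other AI models (such as ChatGPT and Google Gemini)?
DeepSeek, ChatGPT, and Google Gemini each have their own strengths:
* DeepSeek: optimized for Chinese processing, in line with Chinese language habits, and cost-effective.
* ChatGPT: strong at English content generation and creative writing, suited to open-ended conversational applications.
* Google Gemini: strengthened knowledge retrieval and information processing, suited to scenarios that need the support of large amounts of data.
When choosing an AI model, compare them against your needs and application scenario.

DeepSeek Privacy and Security

Is DeepSeek safe and reliable?
DeepSeek takes the privacy and security of user data seriously and applies multiple layers of protection, including data encryption, identity verification, and vulnerability detection, and it follows the relevant data protection regulations. As with any technology, however, potential risks remain; enterprises are advised to add further security measures according to their own needs to ensure data integrity and security.

What is DeepSeek's privacy policy?
DeepSeek's privacy policy emphasizes transparency in data use. It collects only necessary user data, such as device information, operating system, IP address, and access time, mainly for system maintenance and service optimization. DeepSeek pledges not to use personal data without authorization and applies strict security controls. Depending on the regulations of each market and the application scenario, however, users are advised to read the privacy policy in detail and review data usage regularly to reduce potential risks.

Why are some countries or institutions wary of DeepSeek, or even considering restricting its use?
As DeepSeek has quickly become a popular AI model worldwide, some countries are watching its security and potential risks. For example:
* The U.S. Navy has, on security grounds, asked personnel to avoid using DeepSeek models to prevent leaks of classified information.
* Taiwan's Ministry of Digital Affairs has also announced that government agencies must restrict the use of DeepSeek-related AI products to ensure data security.
* Italy, France, Ireland, and other European countries have begun investigating DeepSeek to assess its compliance and influence.

The concerns of these countries and enterprises stem mainly from:
1. Data security risk: if an AI service is deployed on servers abroad, information may leak.
2. Compliance of generated content: DeepSeek's open-source nature lets third parties modify and train the model, which may lead to content output that is hard to regulate.
3. Competitive impact on the industry: some large AI companies may be hit by DeepSeek's low-cost strategy, so market dynamics are shifting.

DeepSeek Technical Analysis

What are DeepSeek's model architecture and the Mixture of Experts (MoE) architecture?
DeepSeek adopts a Mixture of Experts (MoE) architecture, an advanced deep learning technique that improves the model's processing efficiency and accuracy. The core idea of MoE is to divide the model into multiple "expert modules," each responsible for a particular type of input. When the model receives an instruction, a gating network automatically selects the most suitable expert modules for the computation based on the input, achieving efficient inference and generation. This architecture lets DeepSeek allocate resources more effectively on complex tasks, cut unnecessary computation, and stay efficient across diverse application scenarios.

How does DeepSeek use reinforcement learning (RL) to strengthen the model?
DeepSeek uses reinforcement learning (RL) to further improve the model's accuracy and adaptability. For example:
* RLHF (Reinforcement Learning from Human Feedback): trains the AI on human-labeled data so that its responses better match user needs.
* RLAIF (Reinforcement Learning from AI Feedback): uses AI to evaluate and adjust response quality automatically, speeding up learning.
These techniques let DeepSeek keep improving in language understanding, logical reasoning, and content generation, providing more accurate and efficient responses.

What are DeepSeek's training cost and performance?
DeepSeek lowers cost and raises performance through efficient training methods and the MoE architecture. DeepSeek's goal is to make AI technology available to more people, so it has put considerable effort into cost control. Specific training cost and performance figures may vary with the model version and training data.

DeepSeek Integration and Applications

Can DeepSeek be integrated with other applications?
DeepSeek provides an API that developers can use to integrate DeepSeek's functions into all kinds of applications. This means you can apply DeepSeek's AI capabilities in your own products and services, for example by integrating DeepSeek Chat into your customer service system or DeepSeek Coder into your development tools. Through API integration, DeepSeek can bring more value and innovation to your applications.

How good is DeepSeek's reasoning ability?
DeepSeek has strong reasoning ability: it can analyze and understand a user's question in depth and give a reasonable answer. This ability comes from its powerful language model and advanced algorithms. DeepSeek can answer simple questions and can also handle complex logical reasoning and knowledge reasoning tasks.

DeepSeek and the Cloud Market

What is DeepSeek's role in the cloud market?
DeepSeek plays several roles in the cloud market. It is a strong AI model provider, offering models such as DeepSeek Chat and DeepSeek Coder, and it has also formed partnerships with several cloud service providers to deploy its models on their platforms and jointly promote the use of AI technology. At the same time, with its high performance and low cost, DeepSeek has become a strong competitor, pushing the whole cloud AI market to keep developing and innovating.

How does DeepSeek's rise affect the AI industry and the cloud market?
DeepSeek's rise is changing the market landscape of the AI industry. The main effects include:
1. A growing open-source ecosystem: many enterprises have begun to pay attention to open-source AI and to invest resources in customized development.
2. Novel architecture and cost advantage: DeepSeek offers high-performance but relatively low-cost solutions, letting enterprises reduce their AI service spending.
3. Price competition among large AI companies: with DeepSeek in the market, OpenAI, Google, Anthropic, and other major AI vendors may adjust prices or introduce new strategies in response.
4. Tighter government and corporate oversight: some countries have begun to review AI products more strictly to ensure that the technology is safe, lawful, and compliant.

How should enterprises and developers respond?
Enterprises can work with DeepSeek or other AI model providers to integrate their technology into their own products and services and become more competitive. Enterprises should also value the training and recruitment of AI talent. Developers can take an active part in DeepSeek's open-source ecosystem, using the API and tools it provides to build more innovative applications. In addition, both enterprises and developers should pay attention to AI ethics and safety to ensure the healthy development of AI technology.

If you have any questions or needs regarding artificial intelligence solutions, iKala, as an AI specialist, can provide professional support and help you choose the most suitable technology to improve your company's performance and competitiveness. You are welcome to contact us!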
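
The gating idea described in the MoE section can be illustrated with a toy example. The sketch below is a generic softmax-gated, top-k routing layer written in plain Python, a common textbook formulation; it is an assumption for illustration only and not DeepSeek's actual implementation. The class name `ToyMoE`, the random weights, and the choice of two active experts are all hypothetical.

```python
import math
import random


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class ToyMoE:
    """Toy mixture-of-experts layer: a gate scores all experts, only the top-k run."""

    def __init__(self, dim, num_experts, top_k=2, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Gating network: one weight vector per expert (random here, learned in practice).
        self.gate = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
        # Each expert is a small dim x dim linear map standing in for a real sub-network.
        self.experts = [
            [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
            for _ in range(num_experts)
        ]

    def route(self, x):
        """Return (expert index, weight) pairs for the top-k experts, weights renormalized."""
        probs = softmax([dot(g, x) for g in self.gate])
        chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[: self.top_k]
        norm = sum(probs[i] for i in chosen)
        return [(i, probs[i] / norm) for i in chosen]

    def forward(self, x):
        """Weighted sum of the selected experts' outputs; unselected experts are never computed."""
        out = [0.0] * len(x)
        for index, weight in self.route(x):
            expert_out = [dot(row, x) for row in self.experts[index]]
            out = [o + weight * v for o, v in zip(out, expert_out)]
        return out


moe = ToyMoE(dim=4, num_experts=8, top_k=2)
token = [1.0, 0.5, -0.3, 2.0]
print("Selected experts and weights:", moe.route(token))
print("Layer output:", moe.forward(token))
```

Only two of the eight experts do any work for a given input here, which is the source of the compute savings the FAQ mentions: the parameter count grows with the number of experts while the per-input cost stays tied to `top_k`.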

Artificial Intelligence: Needless Worry or a Nightmare Come True? -- Matt Egan
The ‘godfather of AI’ reveals the only way humanity can survive superintelligent AI

Matt Egan, CNN, 08/14/25

Geoffrey Hinton, known as the “godfather of AI,” fears the technology he helped build could wipe out humanity, and “tech bros” are taking the wrong approach to stop it.

Hinton, a Nobel Prize-winning computer scientist and a former Google executive, has warned in the past that there is a 10% to 20% chance that AI wipes out humans. On Tuesday, he expressed doubts about how tech companies are trying to ensure humans remain “dominant” over “submissive” AI systems.

“That’s not going to work. They’re going to be much smarter than us. They’re going to have all sorts of ways to get around that,” Hinton said at Ai4, an industry conference in Las Vegas.

In the future, Hinton warned, AI systems might be able to control humans just as easily as an adult can bribe a 3-year-old with candy. This year has already seen examples of AI systems willing to deceive, cheat and steal to achieve their goals. For example, to avoid being replaced, one AI model tried to blackmail an engineer about an affair it learned about in an email.

Instead of forcing AI to submit to humans, Hinton presented an intriguing solution: building “maternal instincts” into AI models, so “they really care about people” even once the technology becomes more powerful and smarter than humans.

AI systems “will very quickly develop two subgoals, if they’re smart: One is to stay alive… (and) the other subgoal is to get more control,” Hinton said. “There is good reason to believe that any kind of agentic AI will try to stay alive.”

That’s why it is important to foster a sense of compassion for people, Hinton argued. At the conference, he noted that mothers have instincts and social pressure to care for their babies.

“The right model is the only model we have of a more intelligent thing being controlled by a less intelligent thing, which is a mother being controlled by her baby,” Hinton said.
‘The only good outcome’ Hinton said it’s not clear to him exactly how that can be done technically but stressed it’s critical researchers work on it. “That’s the only good outcome. If it’s not going to parent me, it’s going to replace me,” he said. “These super-intelligent caring AI mothers, most of them won’t want to get rid of the maternal instinct because they don’t want us to die.” Hinton is known for his pioneering work on neural networks, which helped pave the way to today’s AI boom. In 2023, he stepped down from Google and started speaking out about the dangers of AI. Not everyone is on board with Hinton’s mother AI approach. Fei-Fei Li, known as the “godmother of AI” for her pioneering work in the field, told CNN on Wednesday that she respectfully disagrees with Hinton, her longtime friend. “I think that’s the wrong way to frame it,” Li, the co-founder and CEO of spatial intelligence startup World Labs, said during a fireside chat at Ai4. Instead, Li is calling for “human-centered AI that preserves human dignity and human agency.” “It’s our responsibility at every single level to create and use technology in the most responsible way. And at no moment, not a single human should be asked or should choose to let go of our dignity,” Li said. “Just because a tool is powerful, as a mother, as an educator and as an inventor, I really believe this is the core of how AI should be centered.” Emmett Shear, who briefly served as interim CEO of ChatGPT owner OpenAI, said he’s not surprised that some AI systems have tried to blackmail humans or bypass shutdown orders. “This keeps happening. This is not going to stop happening,” Shear, the CEO of AI alignment startup Softmax, said at the Ai4 conference. “AIs today are relatively weak, but they’re getting stronger really fast.” Shear said that rather than trying to instill human values into AI systems, a smarter approach would be to forge collaborative relationships between humans and AI. 
AI is accelerating faster than expected Many experts believe AIs will achieve superintelligence, also known as artificial general intelligence, or AGI, in the coming years. Hinton said he used to think it could take 30 years to 50 years to achieve AGI but now sees this moment coming sooner. “A reasonable bet is sometime between five and 20 years,” he said. While Hinton remains concerned about what could go wrong with AI, he is hopeful the technology will pave the way to medical breakthroughs. “We’re going to see radical new drugs. We are going to get much better cancer treatment than the present,” he said. For instance, he said AI will help doctors comb through and correlate the vast amounts of data produced by MRI and CT scans. However, Hinton does not believe AI will help humans achieve immortality. “I don’t believe we’ll live forever,” Hinton said. “I think living forever would be a big mistake. Do you want the world run by 200-year-old white men?” Asked if there’s anything he would have done differently in his career if he knew how fast AI would accelerate, Hinton said he regrets solely focusing on getting AI to work. “I wish I’d thought about safety issues, too,” he said.
|
AI and Intellectual Property: A Side Chapter -- Dade Hayes
Taking the chance to have a little fun at Trump's expense. See also this column's post of 2025/06/28. Robert Thomson, CEO Of Rupert Murdoch’s News Corp, Waggishly Notes That Donald Trump Is Among Authors Hurt By “Blatant Theft” Of AI: “The Art Of The Deal Has Become The Art Of The Steal” Dade Hayes, 08/06/25 Robert Thomson, CEO of Rupert Murdoch’s News Corp., which was sued last month by Donald Trump, found a crafty way to bring Trump back into the conversation Tuesday. The feisty and puckish exec, whose signature communiqués often include alliteration and high-flown language, cited Trump in the company’s fiscal fourth quarter earnings report. Trump’s suit was prompted by an enterprise report in News Corp’s Wall Street Journal exploring ties between the president and convicted pedophile Jeffrey Epstein. Thomson’s reference cleverly avoided any direct reference to Epstein or the lawsuit, and it came after some signs of a thaw in the hot-and-cold relations between Murdoch and Trump. In the company’s earnings release and in prepared remarks on a subsequent conference call with investors, Thomson noted the irony of Trump as an IP holder being victimized by AI. That circumstance has developed even as the president has made several moves to support major tech firms in their development of AI. Last January, he hosted OpenAI chief Sam Altman, Oracle chairman Larry Ellison and others at the White House to announce Project Stargate, an initiative targeting $500 billion in AI infrastructure spending in the U.S. Publishers and media companies like News Corp have selectively made deals with AI firms in an effort to wring revenue from the incursion of the technology. They also have been willing to explore their legal options to fight against what they consider to be theft of their property, as in News Corp’s lawsuit last fall against major AI firm Perplexity. 
The suit claims that the Jeff Bezos-backed purveyor of chatbots and large-language models was trained in part on News Corp properties like the Journal, the New York Post and works published by its HarperCollins book division. “The AI age must cherish the value of intellectual property if we are collectively to realize our potential,” Thomson said in the earnings release. “Much is made of the competition with China, but America’s advantage is ingenuity and creativity, not bits and bytes, not watts but wit. To undermine that comparative advantage by stripping away IP rights is to vandalize our virtuosity. “Even the President of the United States is not immune to this blatant theft. The President’s books are still reporting healthy sales, but are being consumed by AI engines which profit from his thoughts by cannibalizing his concepts, thus undermining future sales of his books. Suddenly, The Art of the Deal has become The Art of the Steal.” During the call, Thomson said Trump’s books (including the still-commercial 1987 title Art of the Deal) are being victimized by the very technology the president is championing. “Is it fair that creators are having their works purloined? Is it just that the President of the United States is being ripped off?” Thomson asked. “Companies are spending tens of billions on data centers, tens of billions on chips, tens of billions on energy generation – these same companies need to spend tens of millions or more on the content crucial for their success. And they need to ensure that the content eco-system remains healthy, that there is a vast range of varied and verifiable sources, and that a deeply derivative Woke AI does not become the default pathway to digital decay. 
“In the meantime, we will fight to protect the intellectual property of our authors and journalists, and continue to woo and to sue companies that violate the most basic property rights.” Asked during a Q&A period with Wall Street analysts during the call about any negatives from AI summaries provided by Google and other AI players, which are starting to siphon web traffic away from publishers, Thomson said nothing material has been noted as of yet. He said the company’s approach to AI companies is a “woo-and-sue” combination. Just like computing and energy resources, AI companies need large amounts of legally cleared content to feed into their models. “AI runs on IP,” Thomson said.
|
GPT-5 Arrives This July and Will Change Everything -- iswarya writes
To repeat: I am a layman when it comes to AI; I only pass along information here and do not vouch for its content.

GPT-5 Is Coming in July 2025 — And Everything Will Change

iswarya writes, 07/08/25

“It’s wild watching people use ChatGPT… knowing what’s coming.” — OpenAI insider

Mark your calendars: July 2025. That’s when the world of AI splits into before and after. If GPT-4 shook the world, GPT-5 is poised to flip it on its axis. This isn’t just another upgrade. This is a paradigm shift. A leap from incredible to unimaginable. And it’s arriving much sooner than most experts predicted.

Why This Timeline Matters

OpenAI doesn’t release new models gradually. Remember GPT-4? Silence, silence, then boom — the world changed overnight. In February 2024, Sam Altman said GPT-5 would follow 4.5 “in months, not years.” If you do the math, that puts us squarely in Summer 2025. And the chatter inside OpenAI supports that timeline. So what can we expect?

GPT-5 Will Redefine Intelligence

Let’s start with a bold claim: GPT-5 will make GPT-4 look like a pocket calculator next to a quantum computer. This model won’t just answer your questions. It will reason, listen, see, code, create, and most importantly — act. Here’s what’s coming:

Enhanced Reasoning
* GPT-4.5 introduced reasoning traces — chains of thought, like logic breadcrumbs.
* GPT-5 shortens these traces while improving the quality, meaning it’s getting better at thinking on its own.

Coding Mastery
* OpenAI engineers now prefer their own tools over everything else. That’s a strong signal.
* Benchmarks suggest GPT-5 could solve nearly all common coding problems, making full-stack AI developers a reality.

Massively Reduced Hallucinations
* GPT-3 hallucinated at ~30%. GPT-5 is expected to drop below 15%.
* This is a solvable engineering problem, and OpenAI is solving it.

Multimodal Everything: From Text to Touch

GPT-5 won’t be text-only. It’s expected to be the first true “everything-to-everything” model:
* Real-time audio understanding (and talking back)
* High-fidelity image recognition and generation
* Video understanding — and possibly video generation
* Voice output so real it’ll make Siri sound like a 90s toy
We are entering an era where AI doesn’t just understand language — it understands reality.

The Numbers (And Why They Matter)

GPT-4 runs on ~1.5 trillion parameters. Rumors around GPT-5 suggest a 1 quadrillion parameter model. While that may be a stretch, a 5–50 trillion range is plausible. But here’s the real kicker:

“The era of scaling just by parameters is over.” — Sam Altman

Instead of just bigger, GPT-5 will be smarter — better architecture, more efficiency, and improved memory.

Agents: Your New Digital Coworkers

We’re not just getting better models. We’re getting autonomous agents. AI tools that:
* Manage workflows
* Use real software (think Excel, Canva, Jira)
* Perform research while you sleep
* Write, test, and deploy full applications

Think of it as having a team of tireless digital workers — available 24/7, never tired, and always learning. The total addressable market? Impossible to calculate. Every human who works on a laptop is in scope.

The Leaks Are Staggering

If internal OpenAI benchmarks are to be believed:
* MMLU (massive multitask language understanding): Up to 95% accuracy
* SWE Bench (software engineering tasks): Up from 32% to 85%
* Advanced math: Cracking problems that stump PhDs
* Multimodal understanding: 90%+ success across vision and text challenges

At this level, we’re not just competing with humans. We’re surpassing them.

Experts Are Wrong About the Timeline

Major think tanks like McKinsey, Brookings, and MIT are underestimating the pace.
* They say AI agents go mainstream by 2027.
* Reality: We’re already building hybrid AI-human teams in 2025.
* They say multi-agent systems emerge in 2027.
* Reality: Open-source multi-agent frameworks are already live.
* They say 30% of knowledge work will be automated by 2027.
* Reality: We’ll likely hit that by the end of 2025.

The automation cliff isn’t coming. We’re already falling off it.
Superintelligence: Not “If,” But “When”

Researchers have now solved the scaling laws — the mathematical relationship between compute, data, and performance. We now know how much is needed to reach artificial superintelligence (ASI). And that realization has triggered a new wave of urgency. Former OpenAI co-founder Ilya Sutskever even left to start a company focused entirely on ASI. When the people who built GPT-4 drop everything to chase superintelligence, it’s time to pay attention.

Three Forces Driving the AI Explosion
1. Scaling laws — Predictable pathways to ASI via compute + data
2. Inference time scaling — Longer thinking = exponentially better results
3. Distillation — AI teaching AI, creating smarter models than the original

These form a feedback loop, where AI builds better AI, which builds better AI…

The First Movers Will Win

At First Movers AI, we’re not waiting for permission. We’re building the next generation of AI-powered companies — now. The window to adapt is narrow. The upside is enormous.
* Your job? Learn to collaborate with AI.
* Your edge? Move faster than the crowd.
* Your future? Shaped by the decisions you make this year.

Final Thoughts

GPT-5 won’t just be a better chatbot. It will:
* Reason better than human analysts
* Write and code at elite levels
* See, hear, and speak across modalities
* Act as an autonomous digital worker
* And possibly, outperform humans in most mental tasks

The transformation isn’t coming. It’s here. And GPT-5 is just the beginning.

Written by iswarya writes. Published in Predict.
|
AI's Bottleneck Is Finding Brand-New Data Sources -- Jack Morris
Mr. Morris appears to be a PhD student at Cornell University. For a Chinese-language reading of the article below, see 《演算法不重要,AI的下一個範式突破,「解鎖」新資料來源才是》 (roughly, "Algorithms Don't Matter; AI's Next Paradigm Breakthrough Lies in 'Unlocking' New Data Sources"). I only pass along the information and do not vouch for whether that reading is correct.

There are no new ideas in AI — only new datasets

All four big breakthroughs in LLMs happened because we unlocked a new source of data. What will be the next one?

Jack Morris, 07/05/25

Most people know that AI has made unbelievable progress over the last 15 years — especially in the last five. It might feel like that progress is inevitable — although large paradigm-shift-level breakthroughs are uncommon, we march on anyway through a stream of slow and steady progress. In fact, some researchers have recently declared a “Moore’s Law for AI,” where the computer’s ability to do certain things (in this case, certain types of coding tasks) increases exponentially with time.

Although I don’t really agree with this specific framing for a number of reasons, I can’t deny the trend of progress. Every year, our AIs get a little bit smarter, a little bit faster, and a little bit cheaper, with no end in sight.

Most people think that this continuous improvement comes from a steady supply of ideas from the research community across academia — mostly MIT, Stanford, CMU — and industry — mostly Meta, Google, and a handful of Chinese labs, with lots of research done at other places that we’ll never get to learn about. And we certainly have made a lot of progress due to research, especially on the systems side of things. This is how we’ve made models cheaper, in particular. Let me cherry-pick a few notable examples from the last couple years:
* In 2022, Stanford researchers gave us FlashAttention, a better way to utilize memory in language models that’s used literally everywhere.
* In 2023, Google researchers developed speculative decoding, which all model providers use to speed up inference (also developed at DeepMind, I believe concurrently?).
* In 2024, a ragtag group of internet fanatics developed Muon, which seems to be a better optimizer than SGD or Adam and may end up as the way we train language models in the future.
* In 2025, DeepSeek released DeepSeek-R1, an open-source model that has equivalent reasoning power to similar closed-source models from AI labs (specifically Google and OpenAI).

So, we’re definitely figuring stuff out. And the reality is actually cooler than that: We’re engaged in a decentralized globalized exercise of Science — where findings are shared openly on arXiv and at conferences and on social media — and every month we’re getting incrementally smarter.

If we’re doing so much important research, why do some argue that progress is slowing down? People are still complaining. The two most recent huge models, Grok 3 and GPT-4.5, only obtained a marginal improvement on capabilities of their predecessors. In one particularly salient example, when language models were evaluated on the latest math olympiad exam, they scored only 5%, indicating that recent announcements may have been overblown when reporting system ability.

And if we try to chronicle the big breakthroughs, the real paradigm shifts, they seem to be happening at a different rate. Let me go through a few that come to mind.

LLMs in four breakthroughs
1. Deep neural networks: Deep neural networks first took off after the AlexNet model won an image recognition competition in 2012.
2. Transformers + LLMs: In 2017, Google proposed transformers in Attention Is All You Need, which led to BERT (Google, 2018) and the original GPT (OpenAI, 2018).
3. RLHF: This was first proposed (to my knowledge) in the InstructGPT paper from OpenAI in 2022.
4. Reasoning: In 2024, OpenAI released O1, which led to DeepSeek R1.

If you squint just a little, these four things (DNNs → Transformer LMs → RLHF → Reasoning) summarize everything that’s happened in AI.
We had DNNs (mostly image recognition systems), then we had text classifiers, then we had chatbots, and now we have reasoning models (whatever those are). Say we want to make a fifth such breakthrough; it could help to study the four cases we have here.

What new research ideas led to these groundbreaking events? It’s not crazy to argue that all the underlying mechanisms of these breakthroughs existed in the 1990s, if not before. We’re applying relatively simple neural network architectures and doing either supervised learning (1 and 2) or reinforcement learning (3 and 4). Supervised learning via cross-entropy, the main way we pre-train language models, emerged from Claude Shannon’s work in the 1940s. Reinforcement learning, the main way we post-train language models via RLHF and reasoning training, is slightly newer. It can be traced to the introduction of policy-gradient methods in 1992 (and these ideas were certainly around for the first edition of the Sutton & Barto “Reinforcement Learning” textbook in 1998).

If our ideas aren’t new, then what is?

OK, let’s agree for now that these “major breakthroughs” were arguably fresh applications of things that we’d known for a while. First of all, this tells us something about the next major breakthrough (that “secret fifth thing” I mentioned above): Our breakthrough is probably not going to come from a completely new idea — rather, it’ll be the resurfacing of something we’ve known for a while. But there’s a missing piece here: Each of these four breakthroughs enabled us to learn from a new data source:
1. AlexNet and its follow-ups unlocked ImageNet, a large database of class-labeled images that drove 15 years of progress in computer vision.
2. Transformers unlocked training on “The Internet” and a race to download, categorize, and parse all the text on the web (which it seems we’ve mostly done by now).
3. RLHF allowed us to learn from human labels indicating what “good text” is (mostly a vibes thing).
4. Reasoning seems to let us learn from “verifiers,” things like calculators and compilers that can evaluate the outputs of language models.

Remind yourself that each of these milestones marks the first time the respective data source (ImageNet, the web, humans, verifiers) was used at scale. Each milestone was followed by a frenzy of activity: Researchers compete to (a) siphon up the remaining useful data from any and all available sources and (b) make better use of the data we have through new tricks to make our systems more efficient and less data-hungry. (I expect we’ll see this trend in reasoning models throughout 2025 and 2026 as researchers compete to find, categorize, and verify everything that might be verified.)

How much do new ideas matter?

There’s something to be said for the fact that our actual technical innovations may not make a huge difference in these cases. Examine the counterfactual. If we hadn’t invented AlexNet, maybe another architecture would have come along that could handle ImageNet. If we never discovered transformers, perhaps we would’ve settled with LSTMs or SSMs or found something else entirely to learn from the mass of useful training data we have available on the web.

This jibes with the theory some people have that nothing matters but data. Some researchers have observed that, for all the training techniques, modeling tricks, and hyperparameter tweaks we make, the thing that makes the biggest difference by-and-large is changing the data.

As one salient example, some researchers worked on developing a new BERT-like model using an architecture other than transformers. They spent a year or so tweaking the architecture in hundreds of different ways and managed to produce a different type of model (this is a state-space model or “SSM”) that performed about equivalently to the original transformer when trained on the same data.
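To make the "verifier" idea above concrete, here is a minimal sketch in Python. It is not taken from the article or from any lab's training code: the function names (`evaluate`, `reward`), the candidate answers, and the 0/1 reward scheme are all assumptions chosen for illustration. A small, safe arithmetic evaluator plays the role of the "calculator", and a model's answer is scored by checking it against that evaluator rather than against a human label.

```python
import ast
import operator

# Arithmetic operators the toy "calculator" verifier understands.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression: str) -> float:
    """Exactly evaluate a +, -, *, / expression over plain numbers."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval"))


def reward(expression: str, model_answer: str) -> float:
    """Hypothetical reward: 1.0 if the answer matches the verifier, else 0.0."""
    try:
        matches = abs(float(model_answer) - evaluate(expression)) < 1e-9
    except (ValueError, ZeroDivisionError):
        return 0.0  # unparseable answers or invalid expressions earn nothing
    return 1.0 if matches else 0.0


# Made-up model outputs for the prompt "17 * 23"; only the first is correct.
candidates = ["391", "381", "not sure"]
rewards = [reward("17 * 23", answer) for answer in candidates]
print(rewards)
```

The point of the sketch is only that the training signal comes from a program that can check outputs, which is why such data can in principle be generated at scale without human labeling. A compiler or unit-test suite would fill the same role for code.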
This discovered equivalence is really profound because it hints that there is an upper bound to what we might learn from a given dataset. All the training tricks and model upgrades in the world won’t get around the cold hard fact that there is only so much you can learn from a given dataset.

And maybe this apathy to new ideas is what we were supposed to take away from The Bitter Lesson. If data is the only thing that matters, why are 95% of people working on new methods?

Where will our next paradigm shift come from? (YouTube…maybe?)

The obvious takeaway is that our next paradigm shift isn’t going to come from an improvement to RL or a fancy new type of neural net. It’s going to come when we unlock a source of data that we haven’t accessed before or haven’t properly harnessed yet.

One obvious source of information that a lot of people are working towards harnessing is video. According to a random site on the web, about 500 hours of video footage are uploaded to YouTube per minute. This is a ridiculous amount of data, much more than is available as text on the entire internet. It’s potentially a much richer source of information too, as videos contain not just words, but the inflection behind them, as well as rich information about physics and culture that just can’t be gleaned from text.

It’s safe to say that as soon as our models get efficient enough or our computers grow beefy enough, Google is going to start training models on YouTube. They own the thing, after all; it would be silly not to use the data to their advantage.

A final contender for the next “big paradigm” in AI is a data-gathering system that is in some way embodied — or, in the words of a regular person, robots. We’re currently not able to gather and process information from cameras and sensors in a way that’s amenable to training large models on GPUs.
If we could build smarter sensors or scale our computers up until they can handle the massive influx of data from a robot with ease, we might be able to use this data in a beneficial way. It’s hard to say whether YouTube or robots or something else will be the Next Big Thing for AI. We seem pretty deeply entrenched in the camp of language models right now, but we also seem to be running out of language data pretty quickly. But if we want to make progress in AI, maybe we should stop looking for new ideas and start looking for new data.
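To put the YouTube figure quoted in the article in perspective, the arithmetic can be sketched as follows. The rate of 500 hours per minute is the article's own number (attributed there to "a random site on the web") and is not verified here; the variable names are illustrative, and a 365-day year is assumed.

```python
# Upload rate quoted in the article; treated here as an unverified assumption.
HOURS_UPLOADED_PER_MINUTE = 500

# Scale the per-minute rate up to a day and to a 365-day year.
hours_per_day = HOURS_UPLOADED_PER_MINUTE * 60 * 24
hours_per_year = hours_per_day * 365

# How many years of nonstop viewing one day of uploads would take.
viewing_years_per_day = hours_per_day / (24 * 365)

print(f"{hours_per_day:,} hours uploaded per day")
print(f"{hours_per_year:,} hours uploaded per year")
print(f"{viewing_years_per_day:.1f} years of continuous viewing per day of uploads")
```

Under that assumed rate, a single day of uploads already exceeds what one person could watch in a lifetime, which is the scale the article has in mind when it calls video an untapped data source.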
|
Do AI Tools Make Their Users Dumber? -- Lucas Ropek
Multiple Studies Now Suggest That AI Will Make Us Morons Are we on the road to Idiocracy? Lucas Ropek, 06/26/25 For the second time in two weeks, a study has been published that suggests that people who use AI regularly may display significantly less cognitive ability than those who don’t rely on it. The studies have bolstered critics’ accusations that AI makes you stupid. The most recent study was conducted by the University of Pennsylvania’s Wharton School and looked at a sample size of over 4,500 participants. The study, which looked at the cognitive differences between people who used LLMs like ChatGPT to do research and those who used Google Search, found that the people who used chatbots tended “to develop shallower knowledge” of the subjects they were researching. Both groups were asked to research how to start a vegetable garden, with some participants randomly selected to use AI, while others were asked to use a search engine. According to the study’s findings, those who used ChatGPT gave much worse advice about how to plant a vegetable garden than those who used the search engine. Researchers write: The shallower knowledge accrues from an inherent feature of LLMs—the presentation of results as syntheses of information rather than individual search links—which makes learning more passive than in standard web search, where users actively discover and synthesize information sources themselves. In turn, when subsequently forming advice on the topic based on what they learned, those who learned from LLM syntheses (vs. standard search results) feel less invested in forming their advice and, more importantly, create advice that is sparser, less original—and ultimately less likely to be adopted by recipients. 
The study concludes that this occurred ironically because of ChatGPT’s advertised benefit: “sparing users the need to browse through results and synthesize information themselves.” Because participants did not have to hunt for information themselves, their “depth of knowledge” was markedly lower than that of those who did. “In this sense, one might view learning through LLMs rather than web search as analogous to being shown the solution to a math problem rather than trying to solve it oneself,” the research concludes. The UPenn study follows on the heels of research produced by MIT, published earlier this month, that showed a similarly problematic cognitive impact produced by AI. That study, which observed the neural activity of college students who were using ChatGPT to study, found that increased AI use resulted in reduced brain activity, or what the researchers termed “cognitive debt.” The study used an EEG machine to measure the neural activity of three different groups of students—one that used ChatGPT to study, one that used Google Search, and one that used neither. The study showed that ChatGPT users displayed markedly less cognitive activity than even those who were using Google Search to find information. The methodology of the MIT article has since been called into question by AI enthusiasts. Critics have noted that the study in question was not peer reviewed and that a small sample size of participants makes it hardly exhaustive. Similarly, critics have argued that while the EEG measurements show certain decreases in specific forms of brain activity, that doesn’t necessarily mean that participants are “dumber” as a result. Indeed, less mental exertion (and, thus, less activity) can be a sign that a person is actually more competent at a task and doesn’t have to expend as much energy as a result. From a certain perspective, these recent assessments of AI’s cerebral impact reek of a moral panic about a new and not altogether well-understood phenomenon. 
On the other hand, the conclusion that using an app to complete a homework assignment makes you less capable of thinking for yourself would appear to be self-evident. Outsourcing mental duties to a software program means you’re not performing those duties yourself, and, as is pretty well established, doing something yourself is often the best way to learn. Of course, the internet has been curtailing human mental activity since it first went online. When was the last time you had to remember how to get somewhere? It really seems like Google Maps collectively robbed us of that ability over a decade ago. Other evidence for AI’s stupidification effect is even more obvious: the maelstrom of cheating that’s been happening in America’s educational system means that students are making their way through high school and college without learning how to write an essay or interpret a book. While there is clearly still a lot to learn about how AI impacts us, some of its side-effects seem obvious. If a student can’t write an essay without the help of a chatbot, they probably don’t have a particularly bright academic future ahead of them.