“Masha interned for my team over the summer of 2017. She displayed skill in quickly learning how to interface with our C++/ROS code base to implement a new module and associated features. She is a great addition to a team, very patient and easy to work with. Her controls and machine learning background is strong.”
San Francisco Bay Area
1K followers
500+ connections
Activity
- Proud academic mother moment! 🎓 My PhD student Priyanka Rao successfully defended her thesis yesterday! The exam went smoothly, and she delivered an…
  Liked by Masha Itkina
- Thanks to everyone involved in this video, and to Tim for being such an excellent spokesperson for Robotics at U of T!
  Liked by Masha Itkina
- I am hiring multiple PhD students this year to launch the Princeton Robot Planning and Learning lab: https://lnkd.in/eRGufxXQ Thanks for helping me…
  Liked by Masha Itkina
Experience & Education
- Toyota Research Institute
  ******** ********* (******* ********)
- *****
  ******* ******** ******
- ******* ********** ******* ******
  ******** *** *********** ******
- ******** **********
  ****** ** ********** - *** *********** *** ************
- ******** **********
  ******'* *********** *** ************ (***********)
Recommendations received
1 person has recommended Masha
More activity by Masha
- Looking forward to speaking at the SAFE-ROL workshop! Come chat if you will be at CoRL. Excited to talk about our recent work and all things…
  Shared by Masha Itkina
Other similar profiles
- Shivani Thakkar (Seattle, WA)
- Ritwick Chaudhry (San Francisco Bay Area)
- Prath Kini (San Francisco, CA)
- Yingwei Li (Cupertino, CA)
- Gaurav M. (Redmond, WA)
- Rosa Morales, PhD, Materials Scientist (Livermore, CA)
- Prachi Gupta (San Francisco, CA)
- Jay Shah (Tempe, AZ)
- Narges Norouzi, Ph.D., Faculty at EECS Berkeley (Berkeley, CA)
- Sujoy Paul (India)
- Shubham Garg (Pittsburgh, PA)
- Jiajun Wu (Stanford, CA)
- Priyam Parashar (Pittsburgh, PA)
- Zi Wang (Cambridge, MA)
- Apurva B. (San Francisco, CA)
- Mounika Vanka, PMP (Atlanta, GA)
- Amirata Ghorbani, PhD (Palo Alto, CA)
- Mo Sarwat (San Francisco, CA)
- Jiaxuan You, Incoming Assistant Professor @ UIUC CS | Sr. Research Scientist @ NVIDIA (Palo Alto, CA)
- Abhinav Shrivastava, Associate Professor @ University of Maryland College Park (Washington DC-Baltimore Area)
Explore more posts
-
Wayne Radinsky
"Are large language models superhuman chemists?" So what these researchers did was make a test -- a benchmark. They made a test of 7,059 chemistry questions, spanning the gamut of chemistry: computational chemistry, physical chemistry, materials science, macromolecular chemistry, electrochemistry, organic chemistry, general chemistry, analytical chemistry, chemical safety, and toxicology. They recruited 41 chemistry experts to carefully validate their test.

They devised the test such that it could be evaluated in a completely automated manner. This meant relying on multiple-choice questions rather than open-ended questions more than they wanted to. The test has 6,202 multiple-choice questions and 857 open-ended questions (88% multiple-choice). The open-ended questions had to have parsers written to find numerical answers in the output in order to test them in an automated manner. In addition, they ask the models to say how confident they are in their answers.

Before I tell you the ranking, the researchers write: "On the one hand, our findings underline the impressive capabilities of LLMs in the chemical sciences: Leading models outperform domain experts in specific chemistry questions on many topics. On the other hand, there are still striking limitations. For very relevant topics the answers models provide are wrong. On top of that, many models are not able to reliably estimate their own limitations. Yet, the success of the models in our evaluations perhaps also reveals more about the limitations of the exams we use to evaluate models -- and chemistry -- than about the models themselves. For instance, while models perform well on many textbook questions, they struggle with questions that require some more reasoning. Given that the models outperformed the average human in our study, we need to rethink how we teach and examine chemistry. Critical reasoning is increasingly essential, and rote solving of problems or memorization of facts is a domain in which LLMs will continue to outperform humans."

"Our findings also highlight the nuanced trade-off between breadth and depth of evaluation frameworks. The analysis of model performance on different topics shows that models' performance varies widely across the subfields they are tested on. However, even within a topic, the performance of models can vary widely depending on the type of question and the reasoning required to answer it."

And with that, I'll tell you the rankings. You can log in to their website at ChemBench.org and see the leaderboard any time for the latest rankings. At this moment I am seeing:

gpt-4: 0.48
claude2: 0.29
GPT-3.5-Turbo: 0.26
gemini-pro: 0.25
mistral_8x7b: 0.24
text-davinci-003: 0.18
Perplexity 7B Chat: 0.18
galactica_120b: 0.15
Perplexity 7B online: 0.1
fb-llama-70b-chat: 0.05
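The open-ended questions were scored by parsing a numerical answer out of each model's free-form output. As a rough illustration of what such a parser has to do (this sketch is mine, not the ChemBench code), a regex-based extractor might look like:

```python
import re

def parse_numeric_answer(output: str):
    """Pull the last number out of a free-form model answer.

    A minimal sketch of the kind of parser the post describes; the real
    benchmark parsers also handle units, answer tags, and formatting quirks.
    """
    # Integers, decimals, and scientific notation such as -3.2e-5.
    matches = re.findall(r"-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?", output)
    return float(matches[-1]) if matches else None

assert parse_numeric_answer("The pKa is approximately 4.76.") == 4.76
assert parse_numeric_answer("No numeric answer given.") is None
```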
-
Ajay Jaiswal
☎☎☎ WeLore for LLM Compression & PEFT ☎☎☎

Sharing our recent new work, rich with interesting insights: From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients.
Paper: https://lnkd.in/eSDUiEdc
Code: https://lnkd.in/eViVgCjV
Model checkpoints: https://lnkd.in/euU5GFDw
Blogpost: https://lnkd.in/efwGzn4t

TL;DR:
1. We establish a consequential relationship between low-rank gradient dynamics during pretraining and the emergence of low-rank structures in weight matrices.
2. Low-rank structures in weights emerge non-uniformly across different layers and types of matrices in LLMs, and that is rooted in the inherent optimization difficulty of LLMs.
3. This understanding leads to WeLore, a simple yet effective algorithm that simultaneously addresses LLM weight compression and parameter-efficient fine-tuning (PEFT) in one unified approach. And it works amazingly well in practice!

Shiwei Liu Zhenyu (Allen) Zhang Atlas Wang Yuandong Tian Lu Yin Jiawei Zhao
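The core operation behind low-rank weight compression is replacing a weight matrix with two thin factors. A minimal numpy sketch of that operation (illustrative only; WeLore's contribution is choosing ranks non-uniformly per matrix, which this does not do):

```python
import numpy as np

def low_rank_approx(W: np.ndarray, energy: float = 0.9):
    """Truncated-SVD approximation of a weight matrix.

    Keeps enough singular values to retain `energy` of the spectral mass;
    a matrix that is "more low-rank" needs a smaller r for the same energy.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    r = int(np.searchsorted(np.cumsum(S) / S.sum(), energy)) + 1
    # Store the two thin factors instead of the full matrix.
    A, B = U[:, :r] * S[:r], Vt[:r, :]
    return A, B, r

W = np.random.randn(512, 512)
A, B, r = low_rank_approx(W, energy=0.9)
print(r, np.linalg.norm(W - A @ B) / np.linalg.norm(W))
```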
-
Cierra Choucair
🦗 Relatively quiet day on the quantum front — but there is quantum excitement to be had in Berkeley Lab's process for identifying and fabricating quantum materials. Plus, if there were QIS standards for secondary schooling, these would be it.

⚆ Researchers at Berkeley Lab have combined rapid computational predictions with precise fabrication techniques to accelerate the discovery of quantum materials; the results are shared in a publicly available database called the Quantum Defect Genome
⚆ IBM has become a partner in the Superconducting Quantum Materials and Systems Center at Fermilab, focusing on advancing superconducting quantum computing technology through tackling challenges in quantum computing, communication, large-scale deployment, and the workforce
⚆ Quandela and Welinq have teamed up to combine their expertise in quantum processors and interconnect technology to create custom quantum links for photonic quantum computers
⚆ attocube systems Inc is providing essential tools for the quantum technology supply chain to improve energy efficiency for scalable quantum computing applications
⚆ The case for integrating quantum informatics into secondary school curricula is made by positioning it within traditional computer science education, aligning it with the Great Principles of Computing, and establishing standards for expected student outcomes

⚆ Don't miss a single qubit -- follow me & subscribe to the free newsletter for your daily quantum computing news & research insights.

#quantumcomputing #quantumtechnology #innovation #quantumscience #quantumphysics
-
Mrityunjay Kumar
As part of my PhD research, I did long interviews with many engineers from SaaS product companies to understand their onboarding journey, esp. how they ramped up on the product. In many cases, I realized that seemingly small things had an outsized positive impact on their onboarding journey (they seemed to have learned better, faster and more). As I get ready to work on a paper to formally report the results, I thought I would share some of the takeaways which may be useful for new campus hires who are going to join a company in a few weeks.

Three seemingly small actions that impacted onboarding significantly:

* 𝐀𝐬𝐤𝐢𝐧𝐠 𝐪𝐮𝐞𝐬𝐭𝐢𝐨𝐧𝐬: Mostly, new campus hires do not ask questions. Some of them said that is because they are used to not asking questions in class, and onboarding feels like a classroom! However, in many instances, the new hire started asking questions at some point, and their experience changed significantly, for the better. There were external situations which caused them to start asking questions (those are stories for a different day, very fascinating!), but they benefitted a lot when they started asking questions.

* 𝐓𝐚𝐤𝐢𝐧𝐠 𝐧𝐨𝐭𝐞𝐬 𝐩𝐮𝐫𝐩𝐨𝐬𝐞𝐟𝐮𝐥𝐥𝐲: Not many people reported taking notes. I found there were a few who took notes with specific goals - one of them wanted to capture what they didn't understand, another took down everything so they could revise within 24 hours, still others had a mental model of the product where they filled the gaps in their understanding through these notes. People who had specific goals when taking notes seemed to have had a much better experience of onboarding than those who didn't take notes or took them without clear purpose.

* 𝐓𝐞𝐬𝐭𝐢𝐧𝐠 𝐭𝐡𝐞 𝐩𝐫𝐨𝐝𝐮𝐜𝐭: Some people were assigned the task of reading and executing test cases; others had mentors telling them to do so. Either way, those who engaged in this activity reported a much better experience with onboarding (they understood the broad picture of the product better) than those who did not.

If you are joining the workforce soon as a new campus hire engineer, do keep these in mind and try them out during your onboarding journey. Do stay proactive and ask for help (for example, if your mentor is not responsive and your questions go unanswered); you will be surprised at how supportive onboarding teams generally are! If you have specific questions on how to do well in your first job out of college, feel free to connect and discuss.

#newcampushires #firstjob #careeradvice
-
Rabab Alomairy, Ph.D.
Are you facing hurdles in scaling up your computational tasks or fine-tuning deep learning models on distributed memory systems with GPU acceleration? Join us at JuliaCon 2024 for the Dagger.jl Workshop! Discover how Dagger's parallel task-based runtime can empower you to efficiently parallelize across threads, nodes, and GPUs. Secure your spot now! 🌟 JuliaCon 2024 #JuliaLang #JuliaCon2024 #ParallelComputing #Daggerjl #GPUComputing
-
Daksh Shami
Ever had one of those research rabbit holes where you lose track of time, fueled by coffee and a nagging question? For months now, I've been captivated by a simple yet profound idea: Could the symmetries we see in nature - and how they change - be the key to unlocking the puzzles of quantum computing?

This question led me deep into the weeds of group theory and quantum circuit complexity. I won't lie - there were days when I felt completely lost, staring at equations that seemed to mock me from the whiteboard. But then came the breakthroughs. Like when I first grasped how "character complexity" - this new tool I've been developing - could actually shed light on the hidden workings of quantum algorithms.

In my latest arXiv preprint, "Character Complexity: A Group-Theoretic Approach to Quantum Circuit Complexity," I dig into this new perspective. Here's what you'll find:

* Introducing "Character Complexity": A new way to measure quantum circuit complexity using group representation theory.
* Unveiling Hidden Structure: Some surprising findings on how character complexity relates to classical simulability.
* Visualizing Complexity: Using the Bloch hypersphere to actually see these complex quantum transformations.

I'm genuinely excited about where this research might lead us in understanding quantum algorithms. But I'm even more eager to hear what you think. Check out the full paper here: https://lnkd.in/gEmdGx3q

Let's discuss:
- Where do you see this approach being useful?
- What new questions does it raise for you?

Looking forward to diving into these quantum mysteries with you all. #QuantumComputing #GroupTheory #ComplexityTheory #Research
-
Joseph Martinez
How can LLMs aid in the process of building simulation models? Some weeks ago, my (first) journal article was published! There, we examined the potential of prompting ChatGPT to generate functional simulation model code from a prose-based narrative. Please check it out! #LLMs #AI #GenAI
-
Danny To Eun Kim
🔍 Ever wondered if there's a unified framework for Retrieval-Augmented Generation (RAG)? We've formalized the retrieval enhancement paradigm with consistent notations, synthesizing research across all machine learning domains. Check out our new preprint: https://lnkd.in/e5iJv99z

📍 Retrieval augmentation isn't just for language models! It can be generalized across machine learning, extending its impact far beyond language generation to areas like speech processing, computer vision, time series prediction, and computational biology.

🌟 We introduce Retrieval-Enhanced Machine Learning (REML), a formal framework that integrates retrieval mechanisms into machine learning models to enhance their performance. Our key contributions:

1️⃣ Synthesis and Unification: We've synthesized existing research and unified it into a coherent framework, providing consistent notations that bridge various ML domains.
2️⃣ Modularization of REML: Retrieval-Enhanced Machine Learning (REML) is modularized into distinct components: Querying, Searching, Presentation, Consumption, Storing, Optimization, and Evaluation. This modular approach makes it easier to understand and implement.
3️⃣ Enriching with IR Research: Information Retrieval (IR) research is often overlooked by NLP researchers. Each modularized part of REML is enriched by the rich history and foundational principles of IR research.
4️⃣ Proposing Future Research: By synthesizing and unifying existing work, we identified numerous gaps and areas for improvement. We propose numerous future research opportunities in this emerging field of REML.

Incredible work with Alireza Salemi, Andrew Drozdov, Fernando Diaz, and Hamed Zamani
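To make the modular decomposition concrete, here is a toy sketch that maps the named components onto code. Every name and signature here is a hypothetical illustration of the idea, not the paper's API:

```python
from dataclasses import dataclass

@dataclass
class REMLPipeline:
    store: list                           # Storing: the document memory

    def query(self, x: str) -> str:       # Querying: turn input into a query
        return x.lower()

    def search(self, q: str, k: int = 2) -> list:  # Searching: retrieve top-k
        return sorted(self.store, key=lambda d: -self._overlap(q, d))[:k]

    def present(self, docs: list) -> str: # Presentation: format for the model
        return "\n".join(f"[doc] {d}" for d in docs)

    def consume(self, x: str) -> str:     # Consumption: condition the model
        ctx = self.present(self.search(self.query(x)))
        return f"{ctx}\n[input] {x}"      # what the downstream model sees

    @staticmethod
    def _overlap(q: str, d: str) -> int:  # stand-in for a real retriever score
        return len(set(q.split()) & set(d.lower().split()))

pipe = REMLPipeline(store=["RAG for language", "retrieval in speech", "vision RAG"])
print(pipe.consume("retrieval for speech"))
```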
-
Vamsi Krishna Puli, Ph.D.
Happy to share that our latest paper is out! In this study, we delve into the world of Slow Feature Analysis, a powerful technique to transform measured data into uncorrelated signals ranging from slow to fast. We've introduced a novel approach that goes beyond traditional methods, addressing the challenge of nonstationary and oscillating features. Our semi-supervised encoder-decoder architecture incorporates a statistical preference for these characteristics, paving the way for more accurate modeling. Curious about the results? We put our approach to the test on both simulated and real industrial processes, with promising outcomes. Thanks to my supervisor, Prof. Biao Huang. Read more about our findings in the full paper! https://lnkd.in/gK6naG5U
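For readers new to the area, classical linear SFA fits in a few lines: whiten the signal, then take the directions whose time derivative has the least variance. A sketch of that textbook baseline only (the paper's approach is a semi-supervised encoder-decoder, not this linear solver):

```python
import numpy as np

def linear_sfa(X: np.ndarray, n_out: int = 2) -> np.ndarray:
    """Textbook linear Slow Feature Analysis; X has shape (time, features)."""
    X = X - X.mean(axis=0)
    # Whiten the inputs so all directions have unit variance.
    eigval, eigvec = np.linalg.eigh(np.cov(X, rowvar=False))
    Z = X @ (eigvec / np.sqrt(eigval))
    # Slow directions minimize the variance of the time derivative.
    dval, dvec = np.linalg.eigh(np.cov(np.diff(Z, axis=0), rowvar=False))
    # eigh sorts ascending, so the first columns are the slowest features.
    return Z @ dvec[:, :n_out]

t = np.linspace(0, 2 * np.pi, 1000)
X = np.column_stack([np.sin(t) + 0.1 * np.random.randn(1000),
                     np.random.randn(1000)])
slow = linear_sfa(X, n_out=1)  # recovers the slowly varying sinusoid
```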
-
Alden Keefe Sampson
This paper is inspired! Much of the current AI weather model research is focused on more accurate and faster versions of what we already have. That's important, but the most exciting advances will come from asking "what *new capabilities* can AI models provide?" I've heard the phrase "AI models will allow us to ask new science questions" before. These folks actually did it and found something fascinating. [Side note: the approach they use to find a more accurate initial state is the same one we use to assimilate near-real-time streamflow observations into HydroForecast at Upstream Tech. Cool!]
-
Aniket Mishrikotkar
It's hard to run inference for LLMs because of their large memory footprint (all the model parameters and activations must be held at inference time), and parallelization is hard because of the autoregressive nature of generating tokens. So to optimize inference, you generally need to lower the memory footprint (use fewer GPUs), the compute (FLOPs), and the latency. There are some approaches to make inference efficient (less memory usage and fast):

1. Apply parallelism techniques to run inference across a large number of GPUs
2. Offload unused data to the CPU and read it back when needed (less memory usage but high latency)
3. Smart batching strategies like continuous batching (vLLM uses this strategy)
4. Compression techniques like quantization, distillation or pruning (quantization is sketched below)
5. Architectural optimizations like using PagedAttention (efficiently manages keys and values in memory)
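As a concrete taste of item 4, a minimal sketch of symmetric per-tensor int8 weight quantization (real systems use per-channel scales, calibration data, and fused dequantize-matmul kernels):

```python
import numpy as np

def quantize_int8(W: np.ndarray):
    """Symmetric per-tensor int8 quantization: 1 byte per weight instead of 4."""
    scale = np.abs(W).max() / 127.0
    Wq = np.clip(np.round(W / scale), -127, 127).astype(np.int8)
    return Wq, scale

def dequantize(Wq: np.ndarray, scale: float) -> np.ndarray:
    # Dequantize on the fly at matmul time.
    return Wq.astype(np.float32) * scale

W = np.random.randn(1024, 1024).astype(np.float32)
Wq, s = quantize_int8(W)
err = np.abs(W - dequantize(Wq, s)).mean()
print(f"4x smaller, mean abs error {err:.4f}")
```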
-
Aishwarya Naresh Reganti
😯🤯 It's super fascinating how LLMs resemble humans in so many aspects. Ever heard of the "lost-in-the-middle" issue with LLMs?

📏 As long contexts become more common, it's really important for folks working in the space to be aware of this issue with LLMs.

💡 A really popular paper from November 2023, titled "Lost in the Middle: How Language Models Use Long Contexts," empirically demonstrates that LLM performance varies significantly depending on the position of relevant information within the context.

📖 Some insights:
⛳ The paper analyzes LLM performance on tasks requiring identification of relevant information within input contexts, including multi-document question answering and key-value retrieval.
⛳ The performance significantly varies based on the position of relevant information within the context.
⛳ LLMs tend to perform best when relevant information is at the beginning or end of the context, while performance degrades when information is in the middle.
⛳ A distinctive U-shaped performance curve is observed. Performance peaks when relevant information is at the start or end of the context and declines significantly when it's in the middle.

🤯 The U-shaped performance curve is so similar to human attention spans during a typical presentation.

🤯 I found a human attention chart from renowned professor and university researcher J.W. Niemantsverdriet's research, which suggests that the attention of a typical human audience doesn't maintain a consistent level throughout a presentation. It peaks at the beginning and end (which we can all agree on 😁).

It's truly intriguing how LLMs often resemble humans in many ways. Typically, they reflect human traits because they're trained on human-generated data. However, this "lost-in-the-middle" resemblance is just 🤯🤯.

🚨 I post #genai content daily, follow along for the latest updates! #genai #llms #contextlength
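The experimental setup behind the U-shaped curve is easy to reproduce in outline: hold the question fixed and sweep the position of the gold document among distractors. A sketch (ask_model is a hypothetical stand-in for whatever LLM call you use):

```python
def build_prompt(question: str, gold: str, distractors: list, position: int) -> str:
    """Place the gold document at a chosen position among distractors."""
    docs = distractors[:position] + [gold] + distractors[position:]
    context = "\n\n".join(f"Document {i + 1}: {d}" for i, d in enumerate(docs))
    return f"{context}\n\nQuestion: {question}\nAnswer:"

def position_sweep(ask_model, question, answer, gold, distractors):
    """Score one question with the gold document at every position."""
    accuracy = {}
    for pos in range(len(distractors) + 1):
        prompt = build_prompt(question, gold, distractors, pos)
        accuracy[pos] = float(answer.lower() in ask_model(prompt).lower())
    # Lost-in-the-middle predicts higher values at the first and last positions.
    return accuracy
```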
-
Michelle Yi
Really enjoyed reading this paper, which won an outstanding paper award at #ICLR2024, on the fair comparison of long-sequence models.

Researchers show here that self-supervised pretraining, even just on the target task's data, leads to dramatically better performance for Transformers and other architectures on long-sequence modeling tasks. This contradicts prior work suggesting Transformers are inherently limited in this area. With pretraining, a vanilla Transformer matches or exceeds the performance of specialized architectures proposed for long sequences. The findings highlight the importance of pretraining in evaluating model capabilities, as training from scratch can grossly underestimate performance. The pretraining is especially beneficial when task data is limited.

I think we always have to question whether we are evaluating AI models appropriately. The dramatic performance improvements from self-supervised pretraining suggest that many models may be far more capable than previous benchmarks have indicated. We may need to rethink how we assess and compare different architectures as we make advances in this space. Evaluation is harder than you think!

Arxiv link: https://lnkd.in/ghTYsd_K
ICLR link: https://lnkd.in/gZTkpSXM

#ArtificialIntelligence #research #pretraining #foundationmodels #llms #data
-
Sebastian Sartor
🚀 Exciting News from the Frontiers of AI and Robotics! 🚀

I'm thrilled to finally share what I've been working on these past few months and announce the preprint of my latest paper, "Neural Scaling Laws for Embodied AI," submitted to NeurIPS! Our study is the first to quantify the scaling laws for Robot Foundation Models (RFMs) and the integration of Large Language Models (LLMs) into robotics.

Through a meta-analysis of 198 research papers, we've explored how compute, model size, and training data impact performance across various robotic tasks. We discovered that, similar to advancements in language and vision, performance in RFMs and LLMs improves predictably with increased resources. The power law coefficients for RFMs closely align with those found in computer vision and surpass those for LLMs in language tasks, highlighting the efficiency of scaling in robotics. Additionally, as models scale, new capabilities emerge, particularly related to data and model size, pushing the boundaries of what's possible in robotics.

These findings are significant as they bring us closer to the long-held dream of general-purpose robotics. The scaling principles that have enhanced LLMs also apply to robotics, suggesting that we might soon experience a "ChatGPT moment" for robotics!

If you're interested in the paper, you can access it here: https://lnkd.in/e5aDcGGZ

A big thank you to Neil Thompson for the excellent mentorship and Prof. Joachim Henkel for co-supervising this project! If you're interested in neural scaling laws or embodied AI/robotics, check out my previous posts for more context. (Scaling Laws: https://lnkd.in/eQWbQeVQ; embodied AI: https://lnkd.in/ebNyv925) Also feel free to reach out with any questions or if I can assist in any way.
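For readers unfamiliar with how such power-law coefficients are estimated: a power law error = a * compute^b becomes a straight line in log-log space, so it can be fit with ordinary least squares. A sketch with made-up numbers (not the paper's data):

```python
import numpy as np

# Fit error = a * compute**b by linear regression in log-log space,
# the standard trick behind scaling-law plots. Illustrative data only.
compute = np.array([1e18, 1e19, 1e20, 1e21, 1e22])
error = np.array([0.52, 0.38, 0.27, 0.20, 0.14])

b, log_a = np.polyfit(np.log(compute), np.log(error), deg=1)
a = np.exp(log_a)
print(f"error ~ {a:.3g} * compute^{b:.3f}")  # b < 0: error falls with compute
```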
-
Julian Samaroo
Are you running large scale numerical computations or calibrating your deep learning models, but are struggling to use multithreading, MPI, or write GPU kernels? Then come join us for the Dagger.jl Workshop, led by Julian Samaroo and Przemysław Szufel (and assisted by Rabab Alomairy, Ph.D.), at JuliaCon 2024! Attendees will learn how to best use Dagger to scale their codes to multiple threads, multiple nodes, and multiple GPUs, all in one session! We'll cover parallelizing array, table, and graph operations, and we'll show you how you can "roll your own" parallelism with Dagger's powerful task and datadeps abstractions. Make sure to register soon so that you too can scale code to the max! https://juliacon.org/2024/ JuliaLang Eindhoven #julia #julialang #hpc #gpu #distributedcomputing #deeplearning #simulations
-
Justin Hodges, PhD
🤔 Want to hide like a fish? - Stay in a school! 🐠🐟

🐠 Ji Zhou et al. published a cool new paper that shows how #fishschools expertly cancel out noise by alternating tail beats, staying surprisingly stealthy while also saving energy! #HighfidelityCFD. Insights into underwater vehicle design. #AUVs

Pictured: the noisiest pattern 👿

"What's quieter than a fish? A school of them."

blog: https://lnkd.in/evP3xZdJ

Reference: Ji Zhou, Jung Hee Seo, and Rajat Mittal. "Effect of schooling on flow generated sounds from carangiform swimmers." Bioinspiration & Biomimetics (2024). https://lnkd.in/etp4rADR

I wonder what's with the boom in popularity of fish posts lately? #cfd #hydroacoustics #hydrodynamics #biomechanics #cae #simulation
-
Andrew Ng
Much has been said about many companies' desire for more compute (and data) to train large foundation models. I think it's under-appreciated that we also have nowhere near enough compute available for inference on foundation models.

Years ago, when I was leading teams at Google, Baidu, and Stanford that focused on scaling up deep learning, many semiconductor makers, data center operators, and researchers asked me if AI would continue to make good use of more compute if they kept delivering it. For many desktop workloads, like running a web browser, a faster CPU doesn't help much beyond a certain point. So do we really need faster and faster AI processors? Each time, I confidently replied "yes!" and encouraged them to keep scaling up compute. (Sometimes, I added half-jokingly that I had never met a machine learning engineer who felt like they had enough compute. 😀) Fortunately, this prediction has been right so far. However, beyond training, we are also far from exhausting the benefits of faster and higher volumes of inference.

Today, a lot of LLM output is for human consumption. A human might read around 250 words per minute, which is around 6 tokens per second (250 words/min / (0.75 words/token) / (60 secs/min)). So it might seem there's little value to generating tokens much faster than this. But in an agentic workflow, an LLM might be prompted repeatedly to reflect on and improve its output, use tools, plan and execute multiple steps, or implement multiple agents that collaborate. So, we might generate hundreds of thousands of tokens or more before showing any output to a user. This makes fast token generation very desirable and makes slower generation a bottleneck to taking better advantage of existing models.

That's why I'm excited about the work of companies like Groq, which can generate hundreds of tokens per second. Recently, SambaNova also showed it can hit hundreds of tokens per second. Incidentally, faster, cheaper token generation will also help make running evaluations (evals), which can be slow and expensive today since it involves iterating over many examples, more palatable.

Fortunately, both training and inference are rapidly becoming cheaper. I spoke with Cathie Wood and Charles Roberts of the investment firm ARK, which is famous for its bullish predictions on tech. They estimate that AI training costs are falling 75% a year. If they are right, a foundation model that costs $100M to train this year might cost $25M to train next year. Further, they report that for "enterprise scale use cases, inference costs seem to be falling at an annual rate of ~86%, even faster than training costs." I don't know how accurate these specific predictions will turn out to be, but with progress in both semiconductors and algorithms, I do see training and inference costs falling rapidly. This will be good for application builders and help AI agentic workflows lift off!

[Original text: https://lnkd.in/dJ9tVGh7 ]
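The reading-speed arithmetic in the post, spelled out (the agentic token budget below is an illustrative assumption, not a measurement):

```python
# Human reading speed in tokens per second, from the post's numbers.
words_per_min = 250
words_per_token = 0.75
tokens_per_sec = words_per_min / words_per_token / 60
print(f"{tokens_per_sec:.1f} tokens/sec")  # ~5.6, i.e. roughly 6

# An agentic workflow that drafts, reflects, and revises can burn far more
# tokens than a user ever reads; assume 200k intermediate tokens per task.
draft_tokens = 200_000
print(f"{draft_tokens / tokens_per_sec / 3600:.1f} hours at reading speed")
```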
-
Jialu Zhang
Check out our latest AI+EDU paper at OOPSLA 2024!

Syntactic bugs and semantic bugs are intrinsically different. A tool that fixes syntactic bugs does not check test-case results. A tool that fixes semantic bugs can be started only if the target program has no syntactic bug. But what if a program contains both types of errors, which happens way too often in intro-level students' programs? 😔

Powered by LLMs, PyDex is the first tool to repair both syntactic and semantic bugs in real-world Python programs, with an accuracy of 96.5%.

Paper link: https://lnkd.in/euW2SWAF
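The distinction between the two bug classes maps onto two different checks: a syntactic bug fails to parse, while a semantic bug parses but fails its tests. A sketch of that triage step only (not PyDex's repair loop; the paths below are hypothetical):

```python
import ast
import subprocess

def classify_bug(source_path: str, test_cmd: list) -> str:
    """Distinguish the two bug classes the post contrasts."""
    source = open(source_path).read()
    try:
        ast.parse(source)          # syntactic check: does the file even parse?
    except SyntaxError:
        return "syntactic"         # semantic tools cannot start from here
    result = subprocess.run(test_cmd, capture_output=True)
    if result.returncode != 0:     # semantic check: do the tests pass?
        return "semantic"
    return "none"

# e.g. classify_bug("student_submission.py", ["pytest", "tests/"])
```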
-
Nishant Pandey
Optimize for 🚀 Bio Cores First, Silicon 🤖 Cores Second

David Heinemeier Hansson makes a compelling case: prioritize human programmers (bio cores) over computer processing power (silicon cores). Here's why it matters:

👉 Investing in programmer efficiency is far more valuable than solely optimizing for computer speed.
👉 Human labor costs far exceed computing costs. Boosting human productivity should be the main focus.
👉 In the age of AI and high-level programming languages, the bio cores drive your real gains.

Bottom line: Make your programmers more productive, and watch your team (and tech) thrive.

Read the full article here: https://lnkd.in/gK8HEGtk

P.S.: The image below is brilliantly summarized using Napkin.ai for a quick visual breakdown.
Others named Masha Itkina
1 other named Masha Itkina is on LinkedIn