At pi Ventures, we back disruptive innovation: companies building 10x differentiated businesses and new technologies to solve large global problems.
Technology is changing rapidly in every domain, whether through the moves AI is making or through the creation of alternative proteins in the lab. We meet a lot of entrepreneurs from around the globe building companies in these spaces, and we learn from each of these conversations. We are bringing these learnings together in a mini-series called 'DeepTech: Emerging Frontiers', which will discuss various emerging trends within DeepTech, ranging from core AI/ML to recent trends in physical innovation. The mini-series is presented in a reader-friendly, bite-sized format.
A Decade of AI: From Latent to Inevitable
In recent months, there has been enormous buzz around generative AI and ChatGPT, and around whether machines are finally coming for our jobs. Not since the large-scale adoption of smartphones in the late 2000s has a new tech emergence felt this sweeping.
However, while some developments in the world of AI seem unexpected, quite a lot has happened in the past decade. The boom in recent years has primarily been attributed to improvements in computing infrastructure and the availability of vast amounts of data that can be used for training purposes. The result: artificial intelligence has touched various parts of our lives with some prominent, visible developments such as self-driving cars, autonomous systems, voice assistants, and computer vision; and some more subtle progressions hidden in papers, publications, models, and frameworks.
The timeline depicts a (non-exhaustive) list of critical moments in AI over the past decade. The 2010s saw the rise of deep learning, which, among other things, helped make sense of unstructured data. The emergence of GANs kicked off the generative AI movement, which accelerated when transformers led to the creation of foundation models.
In parallel, as more and more ML models went into production through the 2010s, the infrastructure layer for AI gained momentum of its own. The term MLOps emerged, and foundational frameworks were built around it. Today, models in production call for observability and drift detection in real time. We will also gradually see a shift from batch-based systems to a more real-time approach to ML.
Investible Deeptech Themes in AI
We looked at how the AI landscape has evolved over the last decade or so. Now, we present our thoughts on the investible themes within AI, especially from a deeptech lens.
The first wave of companies used ML to solve well-defined problems such as churn and LTV prediction, OCR, etc. These spaces quickly became crowded, making the creation of differentiated solutions increasingly challenging. However, this opened up an opportunity for a category of startups offering "AI tools/modules" in a largely off-the-shelf manner.
The next wave of companies was built on the back of advances in NLP technology. We saw the rise of chatbots capable of answering simple queries and guiding humans through a process flow. Technology that could deliver the right customer experience, balancing intuitiveness and technical complexity, defined the winners. Now, with the emergence of large language models, however, the right-to-win of these companies might need a rethink.
Simultaneously, in fields such as healthcare, where obtaining large data sets was challenging, companies that built verticalized custom models also began emerging. Beyond the underlying technology, data was also a strong moat for these businesses. This continues to be an area where valuable companies can be built.
Today, generative AI is creating significant buzz, much of it coming from experimentation and building on top of foundation models. While that is unlocking interesting use cases, we are looking for more disruptive companies in the space. We will cover them in the next blog.
While the above was playing out on the applied AI side, the drive to get models into production quicker fueled the early growth of AI frameworks in the infrastructure space, which was mainly dominated by big companies. However, newer needs such as drift detection and observability, which require differentiated technology, are forming an interesting investment proposition. Another trend we are watching closely is how ML & AI systems, which are largely batch-based today, will transform into real-time continual learning frameworks.
To summarise the investible themes for us @ pi: in Fund 1, we looked at differentiated chatbots, vertical use cases, and some early use cases of Gen AI; in Fund 2, we are also looking strongly at infra use cases, as shown in the diagram.
Evolution of Generative AI: A brief timeline
Generative AI has taken the world by storm. Over the next few posts, we will share a few thoughts on how we're thinking about the space here at pi.
Machines capable of defeating humans at Go, creating original artwork, and predicting structures of proteins. How did we get here?
The first development that gave machines the power of creation dates back to 2014, with the invention of generative adversarial networks (GANs), capable of producing different variants of a seed image.
The next couple of years saw the emergence of language models that could perform tasks such as transcription and basic summarization. However, these models were not capable of understanding contextual nuances that occur so often in language, limiting large scale applications.
Transformers upended this paradigm. Introduced by Google in 2017, transformers are a neural network architecture capable of understanding context and thus, meaning in language. They serve as a building block in several models even today, and paved the way for 'foundation' models such as GPT (Generative Pre-trained Transformer), which performed much better on a wider variety of language tasks.
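To make "understanding context" concrete, below is a minimal NumPy sketch of the scaled dot-product self-attention mechanism at the heart of transformers: each token's new representation is a weighted mix of every token in the sequence, with the weights computed from the data itself. The dimensions and random matrices are purely illustrative, not any production implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence.

    X: (seq_len, d_model) token embeddings
    Wq, Wk, Wv: (d_model, d_k) learned projection matrices
    Returns: (seq_len, d_k) context-aware representations.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Every token scores every other token; scaling stabilises training.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of all token values

# Toy usage: 4 tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```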
Over the next few years, much larger models (both computer vision and language) trained on vast corpora of data became increasingly performant on language and computer vision tasks, culminating in the launch of GPT-3, a 175-billion-parameter model trained on text corpora spanning much of the web and more.
These models are heavy and expensive to train, arguably feasible at the time only for well-funded labs or Big Tech. Massive improvements in hardware have given AI scientists more bang for their buck, allowing them to train even larger models, and results began approaching human performance at breakneck pace.
2022 proved to be a watershed moment for generative AI, with large language models (LLMs) being adapted to solve problems across various industries. Models like ChatGPT now allow us to perform advanced language tasks, such as answering generalised natural language queries, with high accuracy. In the computer vision domain, diffusion techniques led to the launch of photorealistic text-to-image generators such as DALL-E 2 and Midjourney.
To add to this, open-source models such as Stable Diffusion (an alternative to DALL-E 2) and Openjourney (an alternative to Midjourney) have cropped up. Their deployment has been made easy via platforms such as Hugging Face, making generative AI infrastructure accessible to devs and creators alike. Empowered with the right toolset, they can build applications with the appropriate model for their data to power a litany of use cases across industries.
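As a taste of how accessible this stack has become, here is a minimal sketch of generating an image with an open-source Stable Diffusion checkpoint via Hugging Face's diffusers library. The checkpoint id and prompt are illustrative, and exact APIs shift between library versions:

```python
# pip install diffusers transformers accelerate torch
import torch
from diffusers import StableDiffusionPipeline

# Pull an open-source text-to-image checkpoint from the Hugging Face hub.
# "runwayml/stable-diffusion-v1-5" is one commonly used checkpoint id;
# swap in any compatible model you have access to.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")  # drop .to("cuda") and use float32 to run (slowly) on CPU

image = pipe("an isometric illustration of a futuristic lab").images[0]
image.save("lab.png")
```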
Generative AI is transforming every industry as we know it. With the right starter prompts to an appropriate large language model (LLM), it is now possible to generate personalized messages from one's favourite celebrities, build fully functional websites, and even create fully animated 3D worlds.
Investible Deeptech Themes in Generative AI
1. AI based coding co-pilots - Devs are increasingly turning to AI chatbots and GitHub Copilot rather than Stack Overflow message boards (Stack Overflow traffic was down ~14% in March). Coding co-pilots can make suggestions and explain code, considering what users type and other context from their accounts, like the programming languages they're using. Challenges around security vulnerabilities in AI-generated code and the copyright implications of training on publicly available code remain to be solved.
2. 2D / 3D asset creation - Marketers, game developers, and artists have to continually produce 2D and 3D content with limited time and resources. Increasingly, they will be able to set the creative direction and then hand off much of the time-consuming, repetitive execution to AI generators, unlocking large creative markets like gaming, VR, and cinema. Gaming in particular holds enormous promise, with interesting applications such as scene and texture generation from text or image prompts, and infrastructure to train one's own 3D asset generator on one's art style.
3. Drug Discovery and Therapeutics - Designing a novel drug requires scientists to select and validate candidates experimentally from a vast chemical space through trial and error. Using generative AI and a combination of other computational tools (such as AlphaFold and NVIDIA's BioNeMo), scientists can quickly iterate over the design space to tailor biological molecules for therapeutics with desired properties (Gartner forecasts that 50% of all drug discovery initiatives will have some element of generative AI by 2025). Some interesting applications include protein sequence design, lead optimisation, and novel molecule generation. We're super excited about generative biology and have had the privilege to partner with Aridni and Trisha from ImmunitoAI, who are doing great work in the antibody discovery space.
Another overarching theme across Generative AI that interests us is the core infrastructure layer (LLMs and LLMOps) powering all these applications.
AI Infrastructure
While there has been a lot of buzz around generative AI, a critical segment that is often overlooked is AI infrastructure, or MLOps. AI infra refers to the set of tools that power the entire AI model lifecycle, from training and building to deployment & monitoring in production.
There has been exponential growth in organizations implementing AI solutions. However, the transition from research to production remains a critical challenge for enterprises: 90% of models fail to make it to production!
Compared to DevOps tools, MLOps tools need to deal with a lot more complexity as ML models continue evolving over time due to constant updates with new data or algorithm improvements. Hence, it becomes crucial for organizations to adopt robust MLOps tools that enable seamless integration of these changes.
AI infra is a rapidly evolving space that is currently fragmented because of the diversity of data types and use cases these tools have to support. While many tools offer workflow solutions, we believe tools that combine core IP with an engineering workflow solution will emerge as winners in the long run.
Over the last few years, we have seen a multitude of tools emerge to help businesses realize the benefits of AI. MLOps started with a focus on data collection, model development & deployment processes, and has gradually expanded into more complex areas such as observability and real-time serving.
Investible Deeptech Themes in AI Infrastructure
1. AI Observability Platform - It is difficult to train models in the lab on all possible real-world scenarios. This is one of the major reasons for the gap between model performance in the lab and in production. Apart from data drift, concept drift can also occur over time, resulting in a drop in accuracy. With a good observability platform, teams can automatically keep their training data as close to the real-world data as possible (see the drift-detection sketch after this list).
2. Real Time ML Platform - Real-time ML allows organizations to process and analyze data as it is generated, drastically reducing latency compared to traditional batch processing. This is particularly crucial for mission-critical use cases like fraud detection & cybersecurity. For example, real-time ML enhances fraud detection by identifying suspicious behaviours instantaneously as well as continuously learning & adapting based on fresh data (see the online-learning sketch after this list).
3. Synthetic Data Generation Platform - The growing demand for training data in ML has led to the emergence of synthetic data as a viable solution. Getting quality real-world data is a challenge due to its paucity, cost, privacy & ethical concerns. Synthetic data can help comply with regulations like GDPR, improve model accuracy & reduce bias, while enabling faster access to datasets at a fraction of the cost (see the toy generator sketch after this list).
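The observability theme lends itself to a simple illustration. One common building block is comparing the live distribution of each feature against its training distribution, for example with a two-sample Kolmogorov-Smirnov test. A minimal sketch on synthetic data with an illustrative threshold; real platforms track many features, predictions, and labels:

```python
# pip install numpy scipy
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(train_col, live_col, alpha=0.05):
    """Flag distribution drift in one feature via a two-sample KS test.

    Returns True when live data no longer looks like the training data.
    """
    stat, p_value = ks_2samp(train_col, live_col)
    return p_value < alpha

rng = np.random.default_rng(42)
train = rng.normal(loc=0.0, scale=1.0, size=5_000)  # what the model saw
live = rng.normal(loc=0.4, scale=1.0, size=5_000)   # shifted production data

print(detect_drift(train, train[:2_500]))  # False: same distribution
print(detect_drift(train, live))           # True: the mean has drifted
```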
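For the real-time ML theme, the key shift is that the model updates on every event instead of in periodic batch retrains. Here is a minimal sketch using scikit-learn's partial_fit; the transaction features, labels, and event stream are synthetic stand-ins, not a real fraud pipeline:

```python
# pip install numpy scikit-learn
import numpy as np
from sklearn.linear_model import SGDClassifier

# A linear fraud classifier updated incrementally as events stream in,
# rather than being retrained in nightly batches.
clf = SGDClassifier(loss="log_loss")
classes = np.array([0, 1])  # 0 = legitimate, 1 = fraudulent

rng = np.random.default_rng(7)
for _ in range(1_000):                       # stand-in for an event stream
    x = rng.normal(size=(1, 5))              # illustrative transaction features
    y = np.array([int(x[0, 0] + x[0, 1] > 1.0)])  # toy labelling rule
    clf.partial_fit(x, y, classes=classes)   # learn from this single event

print(clf.predict(rng.normal(size=(3, 5))))  # score fresh events immediately
```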
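And for the synthetic data theme, a deliberately simple sketch of the core idea: fit a distribution to real tabular data, then sample fresh rows from it. Production platforms use far richer generators (copulas, GANs, diffusion models) and add privacy guarantees; this Gaussian version only preserves means and correlations:

```python
# pip install numpy
import numpy as np

def fit_gaussian(real):
    """Fit a multivariate Gaussian to real tabular data."""
    return real.mean(axis=0), np.cov(real, rowvar=False)

def sample_synthetic(mean, cov, n, seed=0):
    """Draw synthetic rows matching the real data's means and correlations."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n)

rng = np.random.default_rng(1)
real = rng.normal(size=(500, 3)) @ np.array([[1.0, 0.5, 0.0],
                                             [0.0, 1.0, 0.3],
                                             [0.0, 0.0, 1.0]])
mean, cov = fit_gaussian(real)
synthetic = sample_synthetic(mean, cov, n=1_000)
print(np.round(np.corrcoef(synthetic, rowvar=False), 2))  # correlations kept
```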
Blockchain Infrastructure
It’s no secret that the blockchain industry has been facing a downturn, with price crashes, hacks, stablecoin implosions and lower trading volumes. However, we believe maturing blockchain infrastructure will address concerns around security, scalability and privacy - ushering in a new wave of Web3 adoption, while powering new applications in traditional industry in parallel.
Investible Deeptech Themes in Blockchain Infrastructure
1. Decentralised self-custody - “Not your keys, not your crypto” is a saying oft heard in the Web3 world, with frequent hacks and loss of private keys leading to the loss of user funds. Wallet providers and exchanges face challenges around key management and user signature authentication.
Decentralised self-custody empowers users with the sole authority to move funds from their wallets, and we're proud to have backed Silence Laboratories, which uses threshold signature schemes and multi-party computation to power self-custody for wallet providers and enterprises alike (a toy sketch of the threshold idea follows below).
This technology can also be extended to applications across large enterprises like healthcare companies or financial institutions that deal with confidential data.
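To illustrate the threshold idea, here is a pure-Python sketch of Shamir secret sharing (a textbook scheme, not Silence Laboratories' actual protocol): a key is split into n shares such that any t of them reconstruct it, while fewer reveal nothing.

```python
import secrets

# Toy prime field; real systems use curve-order-sized moduli.
P = 2**127 - 1

def split_secret(secret, n, t):
    """Split `secret` into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [secrets.randbelow(P) for _ in range(t - 1)]
    def f(x):  # evaluate the random degree-(t-1) polynomial at x
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

key = secrets.randbelow(P)             # stand-in for a wallet key
shares = split_secret(key, n=5, t=3)
assert reconstruct(shares[:3]) == key  # any 3 shares recover the key
assert reconstruct(shares[1:4]) == key # ...any 3, not a specific 3
```

Production MPC systems go further: the shares are used to sign collaboratively so the key is never reconstructed in one place; this toy reconstructs it only to demonstrate the threshold property.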
2. Data Privacy and Security - While privacy might seem antithetical to blockchain and its principles, solving privacy concerns is crucial for it to see meaningful applications across industry.
Zero Knowledge Proofs (ZKPs) are an exciting frontier in privacy-preserving blockchain technology, using cryptographic proofs to certify information without disclosing it. While they've been used across the crypto sector to enhance the scalability of L1s via off-chain transaction verification and efficient consensus mechanisms, ZKPs also have immense potential in a wider range of applications where sensitive information is required.
For example, our personal information is regularly fed into machine learning models in the course of daily life, such as when applying for a credit card. Zero-knowledge proofs can give users more control over their data by letting them generate an inference locally and submit a proof, instead of relying on a third party to harvest their private information.
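To give a flavour of the mechanics, here is a toy Schnorr-style proof of knowledge made non-interactive with the Fiat-Shamir heuristic: the prover convinces a verifier that it knows x such that y = g^x mod p, without ever revealing x. The parameters are illustrative only; production ZK systems (SNARKs, STARKs) are far more general and should come from audited libraries:

```python
import hashlib
import secrets

p = 2**255 - 19   # a large prime modulus (illustrative choice)
g = 5             # group generator (illustrative choice)

def challenge(y, t):
    """Fiat-Shamir: derive the challenge by hashing the public transcript."""
    h = hashlib.sha256(f"{g}|{y}|{t}".encode()).digest()
    return int.from_bytes(h, "big") % (p - 1)

def prove(x, y):
    """Prove knowledge of x with y = g^x mod p, without revealing x."""
    r = secrets.randbelow(p - 1)  # fresh secret randomness
    t = pow(g, r, p)              # commitment
    c = challenge(y, t)
    s = (r + c * x) % (p - 1)     # response; r masks x
    return t, s

def verify(y, t, s):
    """Check g^s == t * y^c without ever seeing x."""
    return pow(g, s, p) == (t * pow(y, challenge(y, t), p)) % p

x = secrets.randbelow(p - 1)      # the private witness
y = pow(g, x, p)                  # the public statement
t, s = prove(x, y)
print(verify(y, t, s))            # True, yet x was never disclosed
```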
3. Audit assistants - Audits in Web3 are broken. Customers want the projects they interact with to be audited by expert code reviewers. Minimal reviews can cost between $5k and $15k, reaching $100k+ for complex projects, and take more than a month. Powered by the newest LLMs, AI-powered audit assistants with features such as static and dynamic code analysis of smart contracts, test-case generation in JavaScript, and smart contract programming co-pilots can help streamline this process for projects and auditors alike.
Beyond Digital: Physical Innovation
Previously, we focused on the potential of digital technologies such as AI and crypto infrastructure. In this section, we will focus on innovations in the physical domain, something we term "Beyond Digital". We will talk about material science, the key horizontal technology that will play a pivotal role in pushing the boundaries of what is possible. We are excited about material science and believe that "Just as data is to AI, material science is to physical innovation".
Material science is a multidisciplinary field encompassing the science, engineering, and technology of developing new materials with unprecedented properties and functionalities. Its importance is reflected in how eras of human history are named after the dominant material of the time: the Bronze Age, the Iron Age, the Copper Age, and so on. Even in recent history, materials innovation has revolutionized multiple industries. For example, we enjoy the benefits of electronics thanks to advancements in semiconductor materials and processes. We have now begun to hit the limits of existing semiconductor materials, and the next wave of disruption in the industry will again come from novel alternative materials.
Material innovation is a multi-step process involving discovery, screening, synthesis, and manufacturing. Scientists today employ techniques such as AI-based screening, generative models, robotic automation, high-throughput screening, quantum simulations, 3D printing, and more to discover and develop new materials. As a result, we have gradually moved from trial-and-error approaches that used to take decades to tech-driven approaches with reasonably predictable outcomes and significantly shorter timelines. This acceleration is making the space attractive for investment.
Material science not only accelerates the wheels of industry but also plays a significant role in solving societal challenges. For example, alternative foods and biostimulants have the potential to address food scarcity, while biodegradable materials and a circular economy built on recycling can go a long way toward a more sustainable planet.
While the impact of material science spans across industries, we will focus on two areas - synthetic biology and energy storage.
Synthetic Biology
Synthetic biology is already impacting our lives - from animal-free meat products to the rapid vaccine development that helped us deal with a global pandemic. While bio-based research has taken various forms since the last century, it has now firmly entered the mainstream. This is marked by a shift from an approach based on experimental research to a more defined engineering approach.
Synthetic biology is an emerging field that applies engineering principles to the design & modification of living systems. It enables the programming of cells, similar to how a computer scientist programs a computer. Biological systems like cells, bacteria, and fungi behave like tiny factories, and with some modifications, they can be programmed to give us useful end products. Frameworks like the iterative Design-Build-Test-Learn cycle help systematize the discovery process.
From designing the first simple genetic circuits in the 2000s to designing complete genomes capable of various tasks, we've come a long way. Key highlights include the advent of infrastructure technology like CRISPR-Cas, which enabled precise gene editing at low cost. Next-generation sequencing techniques enabled large-scale testing, reduced costs, and further sped up the design cycle.
These tech developments have created various opportunities for startups in this space.
Investible Themes in Synthetic Biology
1. Alternative proteins - While some companies have seen success in the plant-based protein space, there is market demand for cultivated proteins from animal sources that remain true to the taste and feel of meat and dairy, while being cruelty-free and offering climate benefits. Our portfolio company, Zero Cow Factory, aims to revolutionize the dairy industry by producing animal-free dairy protein using precision fermentation.
2. Drug discovery & precision medicine - Drug discovery has traditionally been a lengthy, iterative process with long lead times. AI & ML techniques help expand the design space, as well as enable quicker exploration of that expanded space. Our portfolio company, ImmunitoAI, uses AI to discover novel antibody therapeutics.
Another important application is precision medicine. Human beings are diverse along many dimensions, such as age, ethnicity, environment, and lifestyle. This diversity is reflected in the individual variability of genes. New disease treatments take this individuality into consideration and aim to develop treatment approaches tailor-made for each individual.
3. Biomaterials - The adoption of biomaterials can impact a variety of industries: construction, agriculture, manufacturing, and textiles, to name a few. Beyond replacing harmful petrochemicals and offering sustainable manufacturing pathways, certain biomaterials can also be engineered to have special properties.
Energy Storage
Picture a scenario where excess energy generated on sun-soaked days is captured and stored seamlessly, only to be unleashed during peak hours of demand, thereby reducing the need for fossil fuels. We are in the midst of a critical energy transition, and the key enabler of this transition is novel energy storage systems (ESS). ESS bridge the gap between energy generation and consumption, and will transform the landscape of power distribution as we know it.
Until now, we have taken a one-size-fits-all approach, with lithium-based batteries used for the majority of energy storage applications. The world is realizing that the entire spectrum of energy use cases cannot be covered by lithium batteries alone, given the specific property requirements of each use case. Startups are coming up with alternative chemistries, each focused on solving a niche energy requirement.
Investible Themes in Energy Storage
1. ESS for stationary applications & microgrids - Demand in this space is expected to grow 30x by 2030. ESS with long cycle life and lower use of rare-earth and costly materials are key to integrating solar & wind energy into the grid. Peak load management, commercial & industrial backup, grid stability, microgrids & frequency regulation are some of the use cases for ESS in this space.
2. Energy Storage enablers - Just as D2C enablers provide seamless infrastructure for selling any D2C product, we believe a whole lot of behind-the-scenes work is going on to support energy storage systems. Companies are working on superior battery management systems, AI-based tools for energy demand management & battery design using simulation. There is also exploration of efficient mining as well as material recycling, which has seen significant traction and will play a key role in the future of ESS.