
From Data-Driven to Knowledge-Driven: The Next Step for AI

Combining human knowledge and data to build better predictive models

Whether you're aware of it or not, you are experiencing exponential change in the world around you, and your mind is constantly working to make sense of and understand these changes. In just the past 30 years we've gone from DOS dial-up to cellular-enabled smartwatches; from never getting into a stranger's car to getting into a strange, driverless one; and from looking through a library's catalog to accessing the knowledge of the entire internet with a simple question to ChatGPT. These advancements would have been reality-shattering as recently as our grandparents' upbringing, and yet within their lifetimes such radical augmentations to the human experience have become merely business as usual.

In episode 86 of the Feedback Loop Podcast, "Knowledge-First AI, GPT3, and More," entrepreneur and technologist Christopher Nguyen explores the many facets of artificial intelligence, including his ideas around knowledge-first artificial intelligence (AI), bad training data for AI, issues with the black box, Generative Pre-trained Transformer 3 (GPT-3), deepfakes, and more.

Thanks to advancements in computing, large-scale data models, and neural networks, we are getting closer to touching human intelligence through our technology, or at least to mimicking it. For many, this progression seems normal, since we have always been augmenting ourselves in new ways using technology that already existed. However, Nguyen highlighted that now that our technology can encroach on the boundary of human intelligence, we may be on the brink of something very different and a little intimidating. "There is something very qualitatively different and powerful but also very disturbing when we think about augmenting our minds with technology that may possibly be smarter than us."

Listen to the full episode now, or continue reading for a few of Christopher Nguyen's insights to keep in mind as we enter this new paradigm of AI progress:

Data alone isn’t enough. 

"You can have terabytes of data, but it doesn't contain the expertise that your engineer or even your user has accumulated in the industry over the last 20, 30 years in their brain. So knowledge-first AI is about the combination of human knowledge and data to build better predictive models than you could with data alone." -Christopher Nguyen

As CEO and Co-Founder of Aitomatic, Inc., the world's only Knowledge-First AI Engine for Industrial AI, Nguyen centers his efforts on translating domain-specific knowledge from natural language into better ML models for the industrial sector, or the "physical industry," as he likes to call it. For the most part, what we've experienced so far is "data-first AI," in which large data sets are used to "teach" the AI. This approach has proven useful in areas that lend themselves well to the digital industry, that is, more quantitative use cases: bits go in -> bits get processed -> bits go out.

Integrating artificial intelligence and machine learning into the physical industry has proven to be a difficult task. We are still physical beings who drive cars and eat fish, yet the physical industry, a $25 trillion market, has been struggling to find success with AI. While industries such as digital marketing and social media have been able to leverage AI to predict clicks and engagement, the physical industry faces a different challenge: there is not enough data to train AI models the way the digital industry does. This is where knowledge-first AI comes in as a solution. By combining human knowledge and expertise with data, better models and predictions can be made. This approach pairs the industry-specific knowledge of engineers and users, accumulated over the last 20-30 years, with the vast amounts of data available today.

Learn how the knowledge-first AI approach has been proven as a concept in the materials space.

ChatGPT isn’t as novel as you may think, but that doesn’t make it any less exciting. 

"I think GPT and what a lot of people now see with ChatGPT is one of those wonderful moments. You know exactly what's happening behind the scenes. But it's still awesome, right? So like, the first moment, Steve Jobs in 2007 stood on the stage and started, you know, swiping the screen, and then the thing moves, and it speeds up and slows down. It's like, wow, right? So there's this wild moment. I'm one of those people that says I want to relish this wild moment, I don't want to dismiss it. And I don't want to be too fearful of it." - Christopher Nguyen

OpenAI isn't the only one working on something like ChatGPT, but it has done a great job of getting it to the public with fairly strong guardrails to ensure a positive user experience. While ChatGPT can feel intelligent, it is not at the level of true artificial general intelligence (AGI). However, by releasing a free prototype, OpenAI is able to leverage the public to capture valuable training data it can use to make future versions of ChatGPT that much more convincing.

According to Nguyen, what's really behind ChatGPT is, in very simplified terms, a fairly basic learning algorithm built on top of GPT-3's deep neural network, which doesn't make it any less fun to use. Within that network, ChatGPT uses the information it can access to perform the computation known as inference. By feeding the network millions of examples, training teaches it how to "weight" the data across many layers, continually reshaping itself to produce more accurate inferences on each pass. This allowed ChatGPT to build a language model that captures how words typically relate to one another. So when a user asks a question, the AI traverses the layers of the neural net, following the "weighted" paths to the most likely arrangement of words. And because those paths were shaped by data from human text, the result appears to be sophisticated communication.

The underlying mechanics are simple: you give it a bunch of inputs, it multiplies those inputs by some weights, sums them up, and sends the outputs to the next layer. That's it. Then you repeat it, usually millions, billions, or even trillions of times. Training it to provide conversational responses was done by allowing it to observe and then to try for itself, similar to how we teach human children, techniques otherwise known as supervised and reinforcement learning.
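The "multiply by weights, sum up, pass to the next layer" step Nguyen describes can be sketched in a few lines of Python. This is a toy illustration of a dense layer with a sigmoid nonlinearity, not ChatGPT's actual architecture; the layer sizes and weight values here are made up for demonstration.

```python
import math

def layer(inputs, weights, biases):
    """One dense layer: for each neuron, multiply the inputs by its
    weights, sum them up, add a bias, and squash with a sigmoid."""
    outputs = []
    for w_row, b in zip(weights, biases):
        total = sum(x * w for x, w in zip(inputs, w_row)) + b
        outputs.append(1 / (1 + math.exp(-total)))  # sigmoid
    return outputs

# Chain two tiny layers: the output of one becomes the input of
# the next. A real network repeats this across many more layers.
hidden = layer([0.5, -1.2], [[0.8, 0.3], [-0.5, 0.9]], [0.1, -0.2])
result = layer(hidden, [[1.0, -1.0]], [0.0])
```

Training is the separate process of nudging those weight values, over millions of examples, so the final outputs become more accurate on each pass.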

“You can intelligently have a dialogue with this thing. Five short years ago, we were talking about initiative and creativity…Right, this thing appears and it appears to have knowledge about the world, right? It's not just repeating something. As a tool, I'm already using it to generate ideas….So in a way [it’s] already teaching me something right? At least as a friend. Maybe it’s not smarter than me or maybe it is smarter than me?  But it's more than a dictionary, it's more than Google. Right? It seems to have an understanding and a knowledge of the world.” 

We’ve focused too much on intent and not on impact. 

The topic of ethics in technology can be complex, with many gray areas. To ensure that technology progresses in a controlled and intelligent way, it is important to consider the ethical implications during discussions. Nguyen emphasizes the distinction between intent and impact, since negative consequences can occur unintentionally. To stay within ethical boundaries, it helps to account for elements of the human condition in technology, such as the biases present in the data machine learning needs to function. Additionally, those creating and developing technology must be aware of the potential consequences and take responsibility for how it will be used. This includes having the knowledge and education to make decisions in the best interest of those who will be impacted by the technology.

The future can be disturbing but that doesn’t mean it’s wrong. 

The question at the back of everyone's mind when talking about AI is a fearful one: will it make us obsolete? Nguyen started off by saying that, like any tool, there are dangers involved.

However, the power of a tool lies not in the tool itself but in where it is used and the scale at which it is applied. To avoid falling into a dystopian future, we must make intentional decisions so that technology moves in calculated directions rather than drifting into potentially harmful ones.

Entertaining the possibility that humans could one day be obsolete might even be favorable for humanity, since our emotional response to that thought keeps us improving ourselves. But, as Nguyen highlighted, we have always found ways to augment ourselves, and technology has always extended our abilities. So what we see today, like GPT, large language models, and multimodal applications, can make us more powerful than before, just as technology has done throughout the years.

“The future has always been disturbing, right? So for example, augmenting our own mind, right, directly with these models, I think is the path of the future, even though it seems very disturbing to some today. But you know, just as me sitting here in a little box with a laptop can be quite disturbing to somebody who worked in the field 100 years ago.” 

And just as we use technology to augment ourselves and improve on the things we do, technology needs to maneuver to correspond to human nature as well. Humanity will infinitely need to adapt as technology adapts. However, Nguyen poses the question: "What happens if technology adapts faster than our biological rate? What does that mean?" 

And now that technology has become so integrated with human knowledge and behavior, it can no longer be treated as separate from discussions about ethics.

The value of a liberal arts education with a strong foundation in STEM is undeniable.

A well-rounded education has always been, and will always be, a cornerstone of the betterment of humanity. Oddly enough, though, we seem to be slowly moving away from it, leaning more into situations where people explore and manage technology without a broad humanities education. People are writing code that reaches billions, for example, without thinking about philosophy or deeper issues before releasing that technology into the world.

With his emphasis on education, Nguyen highlights the launch of his university, Fulbright University in Vietnam, which is part of the Fulbright Program at the Kennedy School. Notably, it specializes in the liberal arts with strong science and engineering foundations. He claims that over the next 10-20 years, its graduates will be leaders of society, business, and industry.

To reiterate the importance of having a well-rounded education, Nguyen referenced a literature professor at UCLA explaining the value of literature that resonated with him. "What literature gives you is the ability to step into the mind of other people and other cultures and experience things that otherwise would be unreachable to you, and thereby become a better person." Likewise, exploring technology allows us to experience a myriad of things we otherwise wouldn't from people and places all around the world. That kind of power should be handled with a well-rounded education and a lot of care.

The human brain is potentially the best machine learning algorithm and the best inference model. A well-rounded education can therefore help people become better engineers by exposing that model to greater variance in its data. It is similar to multilingual people: they know that learning a new language can make them better at the languages they already speak, and we are already seeing the same effect in language models today.

On the other hand, some may argue that even our most advanced machine learning tools, like GPT-3, still work very much like a black box, so that regardless of our educational background or model of the world, we cannot shape much of what goes on inside it. For Nguyen, this is where the discipline of alignment comes in. In the same way you would align your own child with your values, we can align our technology with what we intend. If we build technology and let it run wild, it will end up in a random direction we never intended. The ethics of alignment are also worth bearing in mind, as the field is still emerging.

Still, even with a foundation of knowledge, there will always be things beyond our scope of understanding, such as the black box.

Listen to the full episode of The Feedback Loop for answers to the following questions:

  • Can the black box ever be solved?
  • Can AI be creative? 
  • How will AI disrupt IP law? 
  • How do we need to reconsider intellectual property?
  • How do you take advantage of the new AI economy emerging? 

Valeria Graziani

Valeria Graziani is an accomplished marketer and copywriter. She lives in Arizona.
