Highlights of the Week
We’re back again to roam the world of AI and data science with our pick of intriguing reads from the past week! Have you ever wondered if that article you’re reading was penned by a human or a robot? (Of COURSE you have!) For the developers of the C2PA protocol, the concept of ‘health labels’ for AI-generated content should reduce your consternation. Think of it as nutrition facts, but for your digital consumption!
Meanwhile, the AI supply chain is under threat, and not from the usual suspects. Discover the difference between a prompt injection and a supply chain poisoning (spoiler: one’s a tad more sinister than the other).
And if you’ve ever felt like LLMs were the cool kids on the block, wait till you meet Gorilla, the 800-lb… well, gorilla, that demonstrates the power of bringing API tools … LOTS of API tools … to bear (sic) when using LLMs.
Last but not least, we delve into the harmonious union of LLMs and Neurosymbolic AI as envisioned by Ajit Jaokar of the University of Oxford. Grab your digital butterfly net!
Putting a ‘health label’ on Generative AI’s content
A topic of serious concern is identifying material that has been created (generated?) by AI. The article entitled Cryptography May Offer a Solution to the Massive AI-labeling Problem describes an effort that uses cryptography to encode information about where a piece of content came from. It sounds like something the ‘good guys’ might go forward with. The developers like to use the analogy of the ‘health labels’ one now encounters on the packaging of food products.
The solution to the problem of identifying AI-generated content or data relies on the goodwill of the authoring / publishing organization to “add the label.” C2PA is an open-source internet protocol that uses cryptography to encode provenance information, allowing content creators to opt in to labeling their visual and audio content with information about where it came from. The developers seem to be quiet about using their technique for text-based content. They claim that the protocol offers benefits over AI detection systems, watermarking, and other techniques. However, since it is not legally binding, widespread adoption across the internet ecosystem, especially by social media platforms, will be needed to make it effective.
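To make the ‘health label’ idea a little more concrete, here is a minimal Python sketch of a signed provenance label. It is not the C2PA format or its API: the field names, the `make_label` and `verify_label` helpers, and the shared secret are all assumptions for illustration, and an HMAC stands in for the certificate-based signatures a real provenance scheme would use.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret, used only to keep this sketch stdlib-only.
# A real provenance scheme would sign with a certificate-backed private key.
SECRET_KEY = b"demo-key-not-for-production"


def make_label(content: bytes, creator: str, tool: str) -> dict:
    """Build a signed label that is bound to the content through its hash."""
    claim = {
        "creator": creator,
        "tool": tool,
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_label(content: bytes, label: dict) -> bool:
    """Check that the label was not altered and still matches the content."""
    payload = json.dumps(label["claim"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    signature_ok = hmac.compare_digest(expected, label["signature"])
    content_hash = hashlib.sha256(content).hexdigest()
    content_ok = label["claim"]["content_sha256"] == content_hash
    return signature_ok and content_ok


image = b"pretend these are image bytes"
label = make_label(image, "Example Newsroom", "hypothetical-image-generator")
print(verify_label(image, label))            # label matches the content
print(verify_label(image + b"edit", label))  # content changed after labeling
```

The point of the sketch is the binding: editing either the content or the label after the fact makes verification fail. It also shows the limitation noted above, since nothing forces a publisher to attach a label in the first place.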
Protecting the AI Supply Chain using Data Provenance
Following up on an article we noted a few weeks ago, which demonstrated how the LLM supply chain can be poisoned, the authors of Attacks on AI Models Prompt Injection vs Supply Chain Poisoning discuss two types of attacks on AI models: prompt injection and supply chain poisoning. Prompt injection attacks are carried out by users and usually affect only their own session with the model, while supply chain poisoning attacks are performed by external attackers and affect the whole supply chain.
The article uses a bank assistant chatbot as an example to explore the consequences of these attacks and argues that supply chain poisoning is more concerning due to its potential to impact all end-users of AI models. The article concludes by advocating for greater transparency and traceability in the AI model building process to detect malicious interventions.
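One way to picture ‘traceability in the model building process’ is a tamper-evident build log. The sketch below is an assumption-laden illustration in Python, not the method the article proposes: the step descriptions, the `add_step` and `first_broken_step` helpers, and the log layout are all hypothetical. Each entry records a hash of the artifact produced at that step plus the hash of the previous entry, so a later change to any step breaks the chain.

```python
import hashlib
import json
from typing import Optional


def _hash_entry(entry: dict) -> str:
    """Hash the fields that define a log entry (everything but its own hash)."""
    body = {k: entry[k] for k in ("description", "artifact_sha256", "prev_hash")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def add_step(log: list, description: str, artifact: bytes) -> None:
    """Append a build step that is chained to the previous entry."""
    entry = {
        "description": description,
        "artifact_sha256": hashlib.sha256(artifact).hexdigest(),
        "prev_hash": log[-1]["entry_hash"] if log else "",
    }
    entry["entry_hash"] = _hash_entry(entry)
    log.append(entry)


def first_broken_step(log: list) -> Optional[int]:
    """Return the index of the first inconsistent entry, or None if intact."""
    prev_hash = ""
    for index, entry in enumerate(log):
        if entry["prev_hash"] != prev_hash or entry["entry_hash"] != _hash_entry(entry):
            return index
        prev_hash = entry["entry_hash"]
    return None


log = []
add_step(log, "collect training data", b"dataset v1")
add_step(log, "fine-tune base model", b"model weights")
add_step(log, "package for release", b"release bundle")
print(first_broken_step(log))  # None: the chain is intact

# Simulate a poisoned artifact being swapped in at the fine-tuning step.
log[1]["artifact_sha256"] = hashlib.sha256(b"poisoned weights").hexdigest()
print(first_broken_step(log))  # 1: the altered step is flagged
```

A chain like this only detects changes made after the fact. An attacker who can rewrite the entire log could recompute every hash, so in practice the log (or its final hash) would need to be signed or published somewhere the attacker cannot reach.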
Tools may be the 800-lb Gorilla
According to Gorilla Powered Spotlight Search, the real power of LLMs demands their integration with tools.
Large Language Models (LLMs) excel in tasks like mathematical reasoning and program synthesis but struggle with effectively using tools via API calls. Anyone who has watched the adoption of cloud computing will recognize that APIs have played a huge role in bringing real cloud computing to the market. A new LLM, Gorilla, is now claimed to surpass GPT-4’s performance in writing API calls and to handle test-time document changes well. This, in turn, enables both up-to-the-second access to APIs and flexible API updates. It also reduces hallucination, a common problem with LLMs. The introduction of APIBench, a dataset, helps evaluate Gorilla’s abilities. Gorilla’s successful integration with a document retrieval system has demonstrated improvements in the accuracy and reliability of production LLMs.
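The ‘test-time document changes’ idea can be sketched without any model at all. The Python below is a toy illustration, not Gorilla’s implementation: the API entries, the word-overlap retriever, and the prompt wording are all hypothetical. It shows the general pattern of looking up the current API documentation at query time and placing it in the prompt, so that an updated signature reaches the model without retraining.

```python
def tokenize(text: str) -> set:
    """Lowercase and split on whitespace; enough for a toy retriever."""
    return set(text.lower().split())


def retrieve(query: str, docs: list) -> dict:
    """Return the API doc whose description shares the most words with the query."""
    query_words = tokenize(query)
    return max(docs, key=lambda doc: len(query_words & tokenize(doc["description"])))


def build_prompt(query: str, docs: list) -> str:
    """Put the retrieved, current API signature in front of the task."""
    doc = retrieve(query, docs)
    return f"API reference: {doc['signature']}\nTask: {query}"


# Hypothetical API documentation entries.
docs = [
    {"signature": "translate_text(text, target_lang)",
     "description": "translate text into another language"},
    {"signature": "resize_image(path, width, height)",
     "description": "resize an image file to a new width and height"},
]

query = "translate this sentence into French"
print(build_prompt(query, docs))

# The API changes: only the documentation store is updated.
docs[0]["signature"] = "translate_text_v2(text, source_lang, target_lang)"
print(build_prompt(query, docs))
```

The second prompt carries the new signature even though nothing about the ‘model’ changed, which is the flexibility being claimed. A real system would use a far better retriever than word overlap.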
Ajit Jaokar on the Collaboration of LLMs and Neurosymbolic AI
Symbolic AI, an approach that held prominence in the early days of AI research, is making a comeback as part of a hybrid that uses LLMs as the adjunct. The post artificialintelligence 121 AI - reasoning - LLMs - knowledge graphs - neurosymbolic AI by Ajit Jaokar (Course Director: Artificial Intelligence: Cloud and Edge implementations at University of Oxford) explores the emerging field of neurosymbolic AI, a hybrid approach that combines elements of both neural networks and symbolic AI to address limitations in each paradigm.
The article discusses the role of knowledge graphs in reasoning for AI and identifies several sub-capabilities of reasoning that are crucial for the development of artificial general intelligence (AGI). The article also explores the relationship between knowledge graphs and symbolic AI, outlining similarities in symbolic representation, rule-based reasoning, and logical operations. Finally, the article highlights the advantages of neurosymbolic AI over pure symbolic AI and explores its potential applications in research, learning, and product management.
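For readers who have not met the symbolic half of this pairing, here is a small Python sketch of rule-based reasoning over a knowledge graph. It is an illustration under assumptions, not anything taken from the post: the triples, the `is_a` relation, and the `infer_transitive` helper are hypothetical. The graph is stored as (subject, relation, object) triples, and a single transitivity rule is applied repeatedly until no new facts appear.

```python
def infer_transitive(triples: set, relation: str) -> set:
    """Apply the rule (a R b) and (b R c) => (a R c) until nothing new is derived."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(facts):
            if r1 != relation:
                continue
            for (c, r2, d) in list(facts):
                new_fact = (a, relation, d)
                if r2 == relation and b == c and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


# A tiny hypothetical knowledge graph.
graph = {
    ("gorilla", "is_a", "primate"),
    ("primate", "is_a", "mammal"),
    ("mammal", "is_a", "animal"),
    ("gorilla", "eats", "plants"),
}

closed = infer_transitive(graph, "is_a")
print(("gorilla", "is_a", "animal") in closed)  # derived, never stated directly
```

In a neurosymbolic setup of the kind described, one plausible division of labor is that an LLM proposes the triples from text while a symbolic step like this one derives conclusions that can be checked and explained.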
Thanks for reading. FYI … I do at times use GPT-3.5 to summarize articles. I do so less to have someone/something else do the writing. It’s more to check myself and determine whether I’ve identified the important points. I hope that it improves the quality of these posts. - Rich