Researchers from the Technion and Tel Aviv University have developed an artificial intelligence system that translates protein sequences into natural-language descriptions, a tool they say could help accelerate drug development, biotechnology and materials design.
The system, called BetaDescribe, was presented in a paper published in the journal PNAS. It is designed to analyze protein sequences and generate detailed descriptions of their functions and characteristics.
Protein analysis is central to medicine and biotechnology, but experimental characterization is often slow and costly. The researchers said BetaDescribe could help narrow the gap between the hundreds of thousands of proteins already characterized in laboratories and the billions, or even trillions, believed to exist in nature.
Unlike traditional methods that rely mainly on comparing unknown proteins to known sequences, BetaDescribe combines a generative model with verification and evaluation mechanisms. The researchers said this allows it to infer protein function even when a protein is not closely related to previously studied examples.
The system can describe functional properties, catalytic activity, roles in metabolic processes and possible binding sites relevant to medical and other uses. The team demonstrated the system by successfully describing six previously uncharacterized proteins.
The paper was led by doctoral student Edo Dotan under the joint supervision of Prof. Yonatan Belinkov of the Technion’s Henry and Marilyn Taub Faculty of Computer Science and Prof. Tal Pupko of Tel Aviv University’s School of Life Sciences.
Doctoral student Edo Dotan. Photo: Technion
Prof. Yonatan Belinkov. Photo: Hadas Parush
Prof. Tal Pupko. Photo: Chen Galili
Other co-authors included Prof. Eran Bacharach, Prof. Marcelo Ehrlich and doctoral student Iris Lyubman of Tel Aviv University.
The research was supported by the Israel Science Foundation.



