The end of developers? The world’s first AI software engineer, Devin – now developers enter the AI era!
Could this become reality one day? It seems that day may not be far off.
Devin, billed as the world’s first AI software engineer, has been released and is causing quite a stir.
It was developed by Cognition, a company based in the USA.
Although there are not enough use cases yet to make an accurate assessment, if it works as shown in the demo videos, it could mark a significant milestone in the field of AI software.
- Development Process of Devin
The development of Devin is based on high technical expertise and innovative AI research. The initial goal was to integrate long-term thinking and planning capabilities into AI. To achieve this, the development team focused on machine learning, natural language processing, code generation and understanding, and technologies that enable collaborative interaction with humans.
Devin is designed to work in various development tools and environments, mimicking the real software development process. Through constant iteration and experimentation, Devin gradually improved its ability to solve real engineering problems.
- Tasks Devin Can Perform
Devin can perform a variety of tasks, including:
- Learning new technologies and frameworks
- Building and deploying applications from start to finish
- Finding and fixing bugs in codebases
- Training and fine-tuning AI models
- Handling bug reports and feature requests in open-source repositories
- Contributing to mature production repositories
- User Satisfaction
Users are reported to be highly satisfied with Devin’s efficiency and autonomy. Engineers find value in its ability to perform complex tasks autonomously and solve problems quickly. Users particularly appreciate that Devin reports its progress in real time and incorporates feedback.
- What Makes Devin Stand Out
Devin is expanding the boundaries of AI technology. This AI goes beyond simple task automation to participate in complex problem-solving and creative engineering tasks. Devin’s uniqueness stems from real-time collaboration, continuous learning, and genuine teamwork with humans. These capabilities transform Devin from a mere tool into a core member of the software development team.
- SWE-bench Test Results
Devin’s performance was evaluated on SWE-bench, a challenging benchmark that requires agents to solve real-world GitHub issues found in open-source projects. Devin successfully resolved 13.86% of the issues end-to-end, significantly surpassing the previous best record of 1.96%. Even when given the exact files to edit, the best previous models could only solve 4.80% of the issues.
What is SWE-bench?
SWE-bench is a benchmark for evaluating the performance of AI in the field of software engineering. This benchmark tests the ability of AI to solve real problems found in open-source projects, similar to actual GitHub issues. It aims to assess how effectively AI can understand these problems, propose solutions, and perform actual code modifications.
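To put the reported figures in perspective, the short Python sketch below compares the resolve rates quoted in this post. The percentages are the ones given above; the function name `improvement_factor` is only illustrative and is not part of SWE-bench or any Cognition tooling.

```python
# Resolve rates quoted in this post, in percent of SWE-bench issues resolved.
DEVIN_UNASSISTED = 13.86      # Devin, end-to-end
PREVIOUS_UNASSISTED = 1.96    # previous best, end-to-end
PREVIOUS_ASSISTED = 4.80      # previous best when given the exact files to edit


def improvement_factor(new_rate: float, old_rate: float) -> float:
    """Return how many times larger new_rate is than old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return new_rate / old_rate


if __name__ == "__main__":
    vs_unassisted = improvement_factor(DEVIN_UNASSISTED, PREVIOUS_UNASSISTED)
    vs_assisted = improvement_factor(DEVIN_UNASSISTED, PREVIOUS_ASSISTED)
    gap_points = DEVIN_UNASSISTED - PREVIOUS_UNASSISTED

    print(f"vs. previous unassisted best: {vs_unassisted:.1f}x")
    print(f"vs. previous assisted best:   {vs_assisted:.1f}x")
    print(f"absolute gap: {gap_points:.2f} percentage points")
```

In other words, the reported rate is roughly seven times the previous unassisted best, and still well above what earlier models achieved even when told which files to edit.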
- Devin’s Demo Videos
Devin can quickly learn new things. After reading a blog post, Devin runs ControlNet on Modal to create images with hidden messages for Sara.
Devin can build and deploy apps from scratch. Devin creates an interactive website that simulates the Game of Life, gradually adds features requested by the user, and then deploys the app to Netlify.
About the Game of Life: The “Game of Life,” or simply “Life,” is a cellular automaton devised by British mathematician John Horton Conway in 1970. It runs on a grid of cells, each of which is either “alive” or “dead,” and each new generation is computed from the previous one by a small set of simple rules based on how many live neighbors a cell has. The game is known to be Turing complete, meaning that with a suitable initial configuration it can simulate any computation or algorithm.
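For readers unfamiliar with the rules, here is a minimal Python sketch of a single Game of Life generation. It is not the code Devin produced in the demo; it only illustrates the standard rules: a live cell with two or three live neighbors survives, and a dead cell with exactly three live neighbors becomes alive. The function name `step` and the set-of-coordinates representation are choices made for this illustration.

```python
from collections import Counter


def step(live: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Advance one generation. `live` holds the (row, col) of every live cell."""
    # Count, for every cell adjacent to at least one live cell,
    # how many live neighbors it has.
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth: exactly 3 live neighbors. Survival: 2 live neighbors and already alive.
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


if __name__ == "__main__":
    # A "blinker": three cells in a row that flip between horizontal and vertical.
    blinker = {(1, 0), (1, 1), (1, 2)}
    print(sorted(step(blinker)))        # [(0, 1), (1, 1), (2, 1)]
    print(step(step(blinker)) == blinker)  # True
```

Storing only the live cells keeps the grid unbounded, which suits a simulation where patterns can drift away from their starting position.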
Devin can autonomously find and fix bugs in codebases. Devin assists Andrew in maintaining and debugging his open-source competitive programming book.
Devin can train and fine-tune its own AI models. Given just a link to a research repository on GitHub, Devin sets up fine-tuning for a large language model.
Additionally, Devin can handle bugs and feature requests in open-source repositories. Given just a link to a GitHub issue, Devin takes care of all the necessary setup and context gathering.
Devin can contribute to mature production repositories. This example is part of the SWE-bench benchmark. Devin resolves a bug related to logarithm calculations in the sympy Python algebra system. Devin sets up the code environment, reproduces the bug, codes the fix, and tests it.
- Future Development Model
The future development of Devin will focus on further enhancing AI’s understanding and problem-solving abilities. This includes automating more complex projects, providing better customized solutions for users, and developing more efficient collaboration methods. Additionally, Devin’s development team will continue to focus on research into the ethical use of AI and responsible integration. As AI technology advances, Devin will play a crucial role in driving innovative changes in the field of software engineering, turning the future we imagine into reality.
Click to visit the Cognition Blog