Tang, N., Chen, M., Ning, Z., Bansal, A., Huang, Y., McMillan, C., Li, T.J.-J., "Developer Behaviors in Validating and Repairing LLM-Generated Code Using IDE and Eye Tracking," in Proceedings of the IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), 2024, pp. 40-46, DOI: 10.1109/VL/HCC60511.2024.00015.
The rise of tools like GitHub Copilot, powered by large language models (LLMs), is changing software engineering practice. This study examined how developers validate and repair code generated by Copilot, and how knowing the code's source affects that process. In a study with 28 developers, half were told the code was LLM-generated and half were not. Data from IDE interactions, eye tracking, and interviews showed that, when not informed, developers often failed to recognize the code's origin. When working with code they knew was LLM-generated, developers adjusted their behavior, for example switching more frequently between code and comments and rewriting sections of the code. Awareness of the source improved task performance but also increased workload and reliance on Copilot. The study highlights how making the provenance of LLM-generated code explicit can shape collaboration between developers and AI coding tools.
Fig. 1. Top-2 most frequent behavior transition patterns across all participants.