Ibrahim Khalilov

I build and evaluate AI-enabled systems, with a focus on agent behavior, privacy, security, robustness, and reproducible experimentation.


About Me

I am a PhD student in Computer Science at Johns Hopkins University. My research focuses on AI and agent evaluation, privacy and security, and reproducible systems research.

A major part of my recent work examines how computer-use agents behave under adversarial interface conditions, including fine-print injections and dark-pattern interfaces, and where human oversight can improve outcomes. I also build mobile-system research infrastructure for controlled and repeatable experimentation.

Before pursuing my PhD, I worked as a software engineer at several companies, including the Virginia Institute for Spaceflight and Autonomy (VISA), Thrillworks, and Fairly AI. That industry experience shapes how I connect academic research to practical, real-world applications.

Research

PriviSense: A Frida-Based Framework for Multi-Sensor Spoofing on Android

Ibrahim Khalilov, Chaoran Chen, Ziang Xiao, Tianshi Li, Toby Jia-Jun Li, Yaxing Yao

International Conference on Software Engineering (ICSE) 2026 | Published | Research Paper (Lead Author)

Comparing Human Oversight Strategies for Computer-Use Agents

Chaoran Chen, Zhiping Zhang, Zeya Chen, Eryue Xu, Yinuo Yang, Ibrahim Khalilov, Simret A. Gebreegziabher, Yanfang Ye, Ziang Xiao, Yaxing Yao, Tianshi Li, Toby Jia-Jun Li

UIST 2026 | In review | Research Paper (Co-author)

Beyond Permissions: Investigating Mobile Personalization with Simulated Personas

Ibrahim Khalilov, Chaoran Chen, Ziang Xiao, Tianshi Li, Toby Jia-Jun Li, Yaxing Yao

HAIPS @ CCS 2025 | In review | Position Paper (Lead Author)

The Obvious Invisible Threat: LLM-Powered GUI Agents’ Vulnerability to Fine-Print Injections

Chaoran Chen, Zhiping Zhang, Bingcan Guo, Shang Ma, Ibrahim Khalilov, Simret A. Gebreegziabher, Yanfang Ye, Ziang Xiao, Yaxing Yao, Tianshi Li, Toby Jia-Jun Li

Journal | Published | Workshop Paper (Co-author)

Toward a Human-Centered Evaluation Framework for Trustworthy LLM-Powered GUI Agents

Chaoran Chen, Zhiping Zhang, Ibrahim Khalilov, Bingcan Guo, Simret A. Gebreegziabher, Yanfang Ye, Ziang Xiao, Yaxing Yao, Tianshi Li, Toby Jia-Jun Li

HEAL @ CHI 2025 | Published | Research Paper (Co-author)

Dark Patterns Meet GUI Agents: LLM Agent Susceptibility to Manipulative Interfaces and the Role of Human Oversight

Jingwei Tang, Chaoran Chen, Junnan Li, Zhiping Zhang, Bingcan Guo, Ibrahim Khalilov, Simret A. Gebreegziabher, Yanfang Ye, Tianshi Li, Toby Jia-Jun Li

arXiv preprint arXiv:2509.10723 | Published | Research Paper (Co-author)

Projects

PriviSense: Reproducible Mobile Systems Evaluation

Published

An on-device Android instrumentation toolkit for running controlled experiments with mobile applications. PriviSense uses dynamic instrumentation to modify sensor and system signals at runtime, making it possible to study how apps respond to different simulated contexts without rewriting the target applications.

Key Highlights:

  • Runtime sensor and system-signal spoofing in unmodified Android apps
  • Reproducible testing workflows using automation, screenshots, and logs
  • Published as an ICSE 2026 demo paper

Technologies:

Frida, Android, Python, TypeScript, Termux, Automation

Safety and Robustness Evaluation for Computer-Use Agents

Ongoing

Research on how LLM-powered computer-use agents behave under adversarial interface conditions. This work studies whether agents recognize or fall for manipulative UI elements, where failures occur, and how different forms of human oversight can improve outcomes.

Key Highlights:

  • Evaluated agent behavior under fine-print injections and dark-pattern interfaces
  • Contributed to reproducible testbeds for analyzing agent failure modes
  • Studied when human oversight helps improve agent robustness

Technologies:

LLM Agents, GUI Automation, Evaluation, Red-Teaming, Human Oversight, Python

Software Engineering Experience

HFC Specialist

Scale AI

May 2026 - Present

Selected as a specialist with the Human Frontier Collective, contributing to expert evaluation work for frontier AI systems and technical reasoning projects.

Software Engineer

Thrillworks

2021 - 2023

Built and shipped multiple mobile and web applications for large corporations including President's Choice (PC Financial, PC Insurance) and Ryobi. Created responsive and accessible websites using modern web technologies.

Key Achievements:

  • Delivered production applications for major corporations like PC Financial
  • Built mobile and responsive apps with Flutter, Gatsby, React, and Tailwind CSS
  • Developed and maintained microservices using NestJS

Technologies Used:

React, Node.js, PostgreSQL, AWS, TypeScript

Robotics Software Intern

Virginia Institute for Spaceflight and Autonomy

Summer 2024

Developed applications that control robots over the WamV ROS network, handling remote control and feedback reception. Fine-tuned pretrained models for regression and classification tasks and integrated them into control systems.

Key Achievements:

  • Built robot control application using WamV ROS network architecture
  • Fine-tuned ML models for autonomous navigation and control systems
  • Implemented effective communication between ROS nodes and Docker containers

Technologies Used:

ROS, Python, OpenCV, C++, Linux

Full-stack Developer

Fairly AI

2020 - 2021

Built scalable web applications for AI-powered fairness assessment tools. Worked with machine learning teams to integrate ML models into production web applications.

Key Achievements:

  • Implemented real-time notification system using WebSockets
  • Built AI risk management dashboard with Material-UI
  • Integrated Firebase authentication and developed user management system

Technologies Used:

NestJS, React, Unity, Firebase, Python

Get in Touch

Let's Connect

I'm always interested in discussing research collaborations, potential projects, or just having a conversation about AI, mobile development, and the future of technology. Feel free to reach out!