Marcus, Andrian

Andrian Marcus is a Professor of Computer Science and the Principal Investigator of the Software Evolution and Testing Lab. He also serves on the faculty of the SEERS (SoftwarE Evolution ReSearch) Group. He is best known for his work on "using text retrieval and analysis techniques on software corpora for supporting comprehension during software evolution." Dr. Marcus's research interests include:

  • Software Engineering
  • Software Evolution and Maintenance
  • Program Comprehension
  • Software Analysis and Metrics
  • Cognitive Models of Software Development Processes


Recent Submissions

  • Item
    Reformulating Queries for Duplicate Bug Report Detection
    (Institute of Electrical and Electronics Engineers Inc.) Chaparro, Oscar; Florez, Juan Manuel; Singh, Unnati; Marcus, Andrian; 0000-0001-5450-5598 (Marcus, A); 58848719 (Marcus, A)
    When bugs are reported, one important task is to check if they are new or if they were reported before. Many approaches have been proposed to partially automate duplicate bug report detection, and most of them rely on text retrieval techniques, using the bug reports as queries. Some of them include additional bug information and use complex retrieval- or learning-based methods. In the end, even the most sophisticated approaches fail to retrieve duplicate bug reports in many cases, leaving the bug triagers to their own devices. We argue that these duplicate bug retrieval tools should be used interactively, allowing the users to reformulate the queries to refine the retrieval. With that in mind, we are proposing three query reformulation strategies that require the users to simply select from the bug report the description of the software's observed behavior and/or the bug title, and combine them to issue a new query. The paper reports an empirical evaluation of the reformulation strategies, using a basic duplicate retrieval technique, on bug reports with duplicates from 20 open source projects. The duplicate detector failed to retrieve duplicates in the top 5-30 results for a significant number of the bug reports (between 34% and 50%). We reformulated the queries for a sample of these bug reports and compared the results against the initial query. We found that using the observed behavior description, together with the title, leads to the best retrieval performance. Using only the title or only the observed behavior for reformulation is also better than retrieval with the initial query. The reformulation strategies lead to 56.6%-78% average retrieval improvement over using the initial query only. © 2019 IEEE.
  • Item
    Automatic Software Summarization: The State of the Art
    (IEEE Computer Society) Moreno, L.; Marcus, Andrian; 0000-0001-5450-5598 (Marcus, A); 58848719 (Marcus, A)
    While automatic text summarization has been widely studied for more than fifty years, in software engineering, automatic summarization is an emerging area that shows great potential and poses new and exciting research challenges. This technical briefing provides an introduction to the state of the art and maps future research directions in automatic software summarization. A first version was presented at ICSE'17 and now it is updated and enhanced, based on feedback from the audience.

Works in Treasures @ UT Dallas are made available exclusively for educational purposes such as research or instruction. Literary rights, including copyright for published works held by the creator(s) or their heirs, or other third parties may apply. All rights are reserved unless otherwise indicated by the copyright owner(s).