IEEE Transactions on Software Engineering
Journal Prestige (SJR): 0.548
Citation Impact (CiteScore): 5
Number of Followers: 84  
 
  Hybrid journal (may contain Open Access articles)
ISSN (Print) 0098-5589
Published by IEEE
  • Interacto: A Modern User Interaction Processing Model

      Authors: Arnaud Blouin;Jean-Marc Jézéquel;
      Pages: 3206 - 3226
      Abstract: Since most software systems provide their users with interactive features, building user interfaces (UI) is one of the core software engineering tasks. It consists of designing, implementing and testing ever more sophisticated and versatile ways for users to interact with software systems, and safely connecting these interactions with commands querying or modifying their state. However, most UI frameworks still rely on a low-level model, the bare-bones UI event processing model. This model was suitable for the rather simple UIs of the early 1980s (menus, buttons, keyboards, mouse clicks), but now exhibits major software engineering flaws for modern, highly interactive UIs. These flaws include lack of separation of concerns, weak modularity and thus low reusability of code for advanced interactions, as well as low testability. To mitigate these flaws, we propose Interacto as a high-level user interaction processing model. By reifying the concept of user interaction, Interacto makes it easy to design, implement and test modular and reusable advanced user interactions, and to connect them to commands with built-in undo/redo support. To demonstrate its applicability and generality, we briefly present two open source implementations of Interacto for Java/JavaFX and TypeScript/Angular. We evaluate the interest of Interacto (1) on a real-world case study, where it has been used since 2013, and (2) with a controlled experiment with 44 master's students, comparing it with traditional UI frameworks.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • LogAssist: Assisting Log Analysis Through Log Summarization

      Authors: Steven Locke;Heng Li;Tse-Hsun Peter Chen;Weiyi Shang;Wei Liu;
      Pages: 3227 - 3241
      Abstract: Logs contain valuable information about the runtime behaviors of software systems. Thus, practitioners rely on logs for various tasks such as debugging, system comprehension, and anomaly detection. However, logs are difficult to analyze due to their unstructured nature and large size. In this paper, we propose a novel approach called LogAssist that assists practitioners with log analysis. LogAssist provides an organized and concise view of logs by first grouping logs into event sequences (i.e., workflows), which better illustrate the system runtime execution paths. Then, LogAssist compresses the log events in workflows by hiding consecutive events and applying n-gram modeling to identify common event sequences. We evaluated LogAssist on logs generated by one enterprise and two open source systems. We find that LogAssist can reduce the number of log events that practitioners need to investigate by up to 99 percent. Through a user study with 19 participants, we find that LogAssist can assist practitioners by reducing the time required for log analysis tasks by an average of 40 percent. The participants also rated LogAssist an average of 4.53 out of 5 for improving their experiences of performing log analysis. Finally, we document our experiences and lessons learned from developing and adopting LogAssist in practice. We believe that LogAssist and our reported experiences may lay the basis for future analysis and interactive exploration on logs.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
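
      The two compression steps described in the abstract above can be illustrated in a few lines of Python. This is a toy sketch, not LogAssist's implementation: the function names and the sample log are ours.

          from collections import Counter
          from itertools import groupby

          def hide_consecutive(events):
              """Collapse runs of identical events: A A A B -> (A, 3), (B, 1)."""
              return [(e, len(list(run))) for e, run in groupby(events)]

          def common_ngrams(events, n=2, min_count=2):
              """Count event n-grams; frequent ones are candidate workflows."""
              grams = Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))
              return {g: c for g, c in grams.items() if c >= min_count}

          log = ["open", "read", "read", "read", "close", "open", "read", "close"]
          print(hide_consecutive(log))    # [('open', 1), ('read', 3), ('close', 1), ...]
          print(common_ngrams(log))       # {('open', 'read'): 2, ('read', 'read'): 2, ...}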
       
  • Broken External Links on Stack Overflow

      Authors: Jiakun Liu;Xin Xia;David Lo;Haoxiang Zhang;Ying Zou;Ahmed E. Hassan;Shanping Li;
      Pages: 3242 - 3267
      Abstract: Stack Overflow hosts valuable programming-related knowledge with 11,926,354 links that reference third-party websites. The links that reference resources hosted outside the Stack Overflow website extend the Stack Overflow knowledge base substantially. However, with the rapid development of programming-related knowledge, many resources hosted on the Internet are no longer available. Based on our analysis of the Stack Overflow data that was released on Jun. 2, 2019, 14.2 percent of the links on Stack Overflow are broken. The broken links on Stack Overflow can obstruct viewers from obtaining desired programming-related knowledge, and potentially damage the reputation of Stack Overflow, as viewers might regard the posts with broken links as obsolete. In this paper, we characterize the broken links on Stack Overflow. 65 percent of the broken links in our sampled questions are used to show examples, e.g., code examples. 70 percent of the broken links in our sampled answers are used to provide supporting information, e.g., explaining a certain concept and describing a step to solve a problem. Only 1.67 percent of the posts with broken links are highlighted as such by viewers in the posts’ comments. In only 5.8 percent of the posts with broken links were the broken links removed. Viewers cannot fully rely on the vote scores to detect broken links, as broken links are common across posts with different vote scores. The websites that host resources that can be maintained by their users are referenced by broken links the most on Stack Overflow – a prominent example of such websites is GitHub. The posts and comments related to web technologies, i.e., JavaScript, HTML, CSS, and jQuery, are associated with more broken links. Based on our findings, we shed light on future directions and provide recommendations for practitioners and researchers.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
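
      The notion of a broken link can be approximated with a simple HTTP liveness probe. A minimal sketch using the third-party requests library; the paper's actual detection protocol is more careful than this:

          import requests

          def is_broken(url, timeout=10):
              """Treat 4xx/5xx responses and network errors as broken."""
              try:
                  resp = requests.head(url, timeout=timeout, allow_redirects=True)
                  if resp.status_code == 405:  # some hosts reject HEAD; retry with GET
                      resp = requests.get(url, timeout=timeout, allow_redirects=True)
                  return resp.status_code >= 400
              except requests.RequestException:  # DNS failure, timeout, TLS error, ...
                  return True

          for url in ["https://github.com", "http://example.invalid/gone"]:
              print(url, "->", "broken" if is_broken(url) else "alive")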
       
  • Generating Unit Tests for Documentation

      Authors: Mathieu Nassif;Alexa Hernandez;Ashvitha Sridharan;Martin P. Robillard;
      Pages: 3268 - 3279
      Abstract: Software projects capture redundant information in various kinds of artifacts, as specifications from the source code are also tested and documented. Such redundancy provides an opportunity to reduce development effort by supporting the joint generation of different types of artifacts. We introduce a tool-supported technique, called DScribe, that allows developers to combine unit test and documentation templates, and to invoke these templates to generate documentation and unit tests. DScribe supports the detection and replacement of outdated documentation, and the use of templates can encourage extensive test suites with a consistent style. Our evaluation of 835 specifications revealed that 85 percent were not tested or correctly documented, and DScribe could be used to automatically generate 97 percent of the tests and documentation. An additional study revealed that tests generated by DScribe are more focused and readable than those written by human testers or generated by state-of-the-art automated techniques.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
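
      The core idea of pairing a unit test template with a documentation template, so that one invocation yields both artifacts and they cannot drift apart, can be sketched as follows. The template syntax and names here are hypothetical, not DScribe's:

          # One specification ("method rejects null") instantiates both templates.
          TEST_TEMPLATE = """\
          @Test
          void {method}RejectsNull() {{
              assertThrows(NullPointerException.class, () -> {receiver}.{method}(null));
          }}"""

          DOC_TEMPLATE = "@throws NullPointerException if {param} is null"

          def generate(receiver, method, param):
              values = dict(receiver=receiver, method=method, param=param)
              return TEST_TEMPLATE.format(**values), DOC_TEMPLATE.format(**values)

          test, doc = generate(receiver="parser", method="parse", param="source")
          print(test)
          print(doc)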
       
  • Deep Learning Based Vulnerability Detection: Are We There Yet?

      Authors: Saikat Chakraborty;Rahul Krishna;Yangruibo Ding;Baishakhi Ray;
      Pages: 3280 - 3296
      Abstract: Automated detection of software vulnerabilities is a fundamental problem in software security. Existing program analysis techniques suffer from either high false positives or high false negatives. Recent progress in Deep Learning (DL) has resulted in a surge of interest in applying DL for automated vulnerability detection. Several recent studies have demonstrated promising results achieving an accuracy of up to 95 percent at detecting vulnerabilities. In this paper, we ask, “how well do the state-of-the-art DL-based techniques perform in a real-world vulnerability prediction scenario?” To our surprise, we find that their performance drops by more than 50 percent. A systematic investigation of what causes such a precipitous performance drop reveals that existing DL-based vulnerability prediction approaches suffer from challenges with the training data (e.g., data duplication, unrealistic distribution of vulnerable classes, etc.) and with the model choices (e.g., simple token-based models). As a result, these approaches often do not learn features related to the actual cause of the vulnerabilities. Instead, they learn unrelated artifacts from the dataset (e.g., specific variable/function names, etc.). Leveraging these empirical findings, we demonstrate how a more principled approach to data collection and model design, based on realistic settings of vulnerability prediction, can lead to better solutions. The resulting tools perform significantly better than the studied baseline—up to 33.57 percent boost in precision and 128.38 percent boost in recall compared to the best performing model in the literature. Overall, this paper elucidates existing DL-based vulnerability prediction systems’ potential issues and draws a roadmap for future DL-based vulnerability prediction research.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
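
      One of the data problems named above, duplication, is cheap to mitigate. A toy sketch of near-duplicate removal by hashing normalized code; the paper's pipeline is more elaborate:

          import hashlib
          import re

          def normalize(code: str) -> str:
              """Drop comments and all whitespace so formatting variants collapse."""
              code = re.sub(r"/\*.*?\*/", "", code, flags=re.S)  # block comments
              code = re.sub(r"//[^\n]*", "", code)               # line comments
              return re.sub(r"\s+", "", code)

          def deduplicate(samples):
              seen, unique = set(), []
              for label, code in samples:
                  h = hashlib.sha256(normalize(code).encode()).hexdigest()
                  if h not in seen:
                      seen.add(h)
                      unique.append((label, code))
              return unique

          data = [(1, "int f(){return 0;}"), (1, "int f() { return 0; }  // dup")]
          print(len(deduplicate(data)))  # 1 -- the two variants collapse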
       
  • The Ghost Commit Problem When Identifying Fix-Inducing Changes: An
           Empirical Study of Apache Projects

      Authors: Christophe Rezk;Yasutaka Kamei;Shane McIntosh;
      Pages: 3297 - 3309
      Abstract: The SZZ approach for identifying fix-inducing changes traces backwards from a commit that fixes a defect to those commits that are implicated in the fix. This approach is at the heart of studies of characteristics of fix-inducing changes, as well as the popular Just-in-Time (JIT) variant of defect prediction. However, some types of commits are invisible to the SZZ approach. We refer to these invisible commits as “Ghost Commits.” In this paper, we set out to define, quantify, characterize, and mitigate ghost commits that impact the SZZ algorithm during its mapping phase (i.e., linking defect-fixing commits to those commits that are implicated by the fix) and its filtering phase (i.e., removing improbable fix-inducing commits from the set of implicated commits). We mine the version control repositories of 14 open source Apache projects for instances of mapping-phase and filtering-phase ghost commits. We find that (1) 5.66–11.72 percent of defect-fixing commits only add lines, and thus, cannot be mapped back to implicated commits; (2) 1.05–4.60 percent of the studied commits only remove lines, and thus, cannot be implicated in future fixes; and (3) that no implicated commits survive the filtering process of 0.35–14.49 percent of defect-fixing commits. Qualitative analysis of ghost commits reveals that 46.5 percent of 142 addition-only defect-fixing commits add checks (e.g., null-ness or emptiness checks), while 39.7 percent of 307 removal-only commits clean up (unused) code. Our results suggest that the next generation of SZZ improvements should be language-aware to connect ghost commits to implicated and defect-fixing commits. Based on our observations, we discuss promising directions for mitigation strategies to address each type of ghost commit. Moreover, we implement mitigation strategies for addition-only commits and evaluate those strategies with respect to a baseline approach. The results indicate that our strategies achieve a precision of 0.753, improving the precision of implicated commits by 39.5 percentage points.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
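
      Addition-only and removal-only commits, the two ghost types quantified above, can be spotted directly from diff statistics. A sketch using the git command line; the function name and classification labels are ours:

          import subprocess

          def change_kind(repo, sha):
              """Classify a commit from `git show --numstat` line counts."""
              out = subprocess.check_output(
                  ["git", "-C", repo, "show", "--numstat", "--format=", sha],
                  text=True)
              added = deleted = 0
              for line in out.splitlines():
                  parts = line.split("\t")
                  if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
                      added += int(parts[0])
                      deleted += int(parts[1])
              if added and not deleted:
                  return "addition-only"  # invisible to SZZ's blame-based mapping
              if deleted and not added:
                  return "removal-only"   # can never be implicated by a later fix
              return "mixed"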
       
  • A Study About the Knowledge and Use of Requirements Engineering Standards
           in Industry

      Authors: Xavier Franch;Martin Glinz;Daniel Mendez;Norbert Seyff;
      Pages: 3310 - 3325
      Abstract: Context. The use of standards is considered a vital part of any engineering discipline. So one could expect that standards play an important role in Requirements Engineering (RE) as well. However, little is known about the actual knowledge and use of RE-related standards in industry. Objective. In this article, we investigate to which extent standards and related artifacts such as templates or guidelines are known and used by RE practitioners. Method. To this end, we have conducted a questionnaire-based online survey. We could analyze the replies from 90 RE practitioners using a combination of closed and open-text questions. Results. Our results indicate that the knowledge and use of standards and related artifacts in RE is less widespread than one might expect from an engineering perspective. For example, about 47% of the respondents working as requirements engineers or business analysts do not know the core standard in RE, ISO/IEC/IEEE 29148. Participants in our study mostly use standards by personal decision rather than being imposed by their respective company, customer, or regulator. Beyond insufficient knowledge, we also found cultural and organizational factors impeding the widespread adoption of standards in RE. Conclusions. Overall, our results provide empirically informed insights into the actual use of standards and related artifacts in RE practice and – indirectly – about the value that the current standards create for RE practitioners.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • Emotions and Perceived Productivity of Software Developers at the
           Workplace

      Authors: Daniela Girardi;Filippo Lanubile;Nicole Novielli;Alexander Serebrenik;
      Pages: 3326 - 3341
      Abstract: Emotions are known to impact cognitive skills, thus influencing job performance. This is also true for software development, which requires creativity and problem-solving abilities. In this paper, we report the results of a field study involving professional developers from five different companies. We provide empirical evidence that a link exists between emotions and perceived productivity at the workplace. Furthermore, we present a taxonomy of triggers for developers’ positive and negative emotions, based on the qualitative analysis of participants’ self-reported answers collected through daily experience sampling. Finally, we experiment with a minimal set of non-invasive biometric sensors that we use as input for emotion detection. We found that positive emotional valence, neutral arousal, and high dominance are prevalent. We also found a positive correlation between emotional valence and perceived productivity, with a stronger correlation in the afternoon. Both social and individual breaks emerge as useful for restoring a positive mood. Furthermore, we found that a minimum set of non-invasive biometric sensors can be used as a predictor for emotions, provided that training is performed on an individual basis. While promising, our classifier performance is not yet robust enough for practical usage. Further data collection is required to strengthen the classifier, by also implementing individual fine-tuning of emotion models.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • A Deep Dive into the Impact of COVID-19 on Software Development

      Authors: Paulo Anselmo da Mota Silveira Neto;Umme Ayda Mannan;Eduardo Santana de Almeida;Nachiappan Nagappan;David Lo;Pavneet Singh Kochhar;Cuiyun Gao;Iftekhar Ahmed;
      Pages: 3342 - 3360
      Abstract: The COVID-19 pandemic is considered the most crucial global health calamity of the century. It has impacted different business sectors around the world, and software development is no exception. This study investigates the impact of COVID-19 on software projects and software development professionals. We conducted a mining software repository study based on 100 GitHub projects developed in Java using ten different metrics. Next, we surveyed 279 software development professionals to better understand the impact of COVID-19 on daily activities and wellbeing. We identified 12 observations related to productivity, code quality, and wellbeing. Our findings highlight that the impact of COVID-19 is not binary (reduced productivity versus increased productivity) but rather a spectrum. For many of our observations, substantial proportions of respondents have differing opinions from each other. We believe that more research is needed to uncover specific conditions that cause certain outcomes to be more prevalent.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • An Experience Report on Producing Verifiable Builds for Large-Scale
           Commercial Systems

      Authors: Yong Shi;Mingzhi Wen;Filipe R. Cogo;Boyuan Chen;Zhen Ming Jiang;
      Pages: 3361 - 3377
      Abstract: Build verifiability is a safety property of a software system that can be used to check for various security-related issues during the build process. In summary, a verifiable build generates equivalent build artifacts for every build instance, allowing independent auditors to verify that the generated artifacts correspond to their source code. Producing a verifiable build is a very challenging problem, as non-equivalences in the build artifacts can be caused by non-determinism from the build environment, the build toolchain, or the system implementation. Existing research and practices on build verifiability mainly focus on remediating sources of non-determinism. However, such a process does not work well with large-scale commercial systems (LSCSs) due to their stringent security requirements, complex third-party dependencies, and large volumes of code changes. In this paper, we present an experience report on using a unified process and a toolkit to produce verifiable builds for LSCSs. A unified process contrasts with the existing practices in which recommendations to mitigate sources of non-determinism are proposed on a case-by-case basis and are not codified in a comprehensive tool. Our approach supports the following three strategies to systematically mitigate non-equivalences in the build artifacts: remediation, controlling, and interpretation. A case study on three LSCSs within Huawei shows that our approach is able to increase the proportion of verified build artifacts from less than 50 to 100 percent. To cross-validate our approach, we successfully applied it to build 2,218 open source packages distributed under CentOS 7.8, increasing the proportion of verified build artifacts from 85 to 99 percent with minimal human intervention. We also provide an overview of our mitigation guideline, which describes the recommended strategies to mitigate various non-equivalences. Finally, we present some discussions and open research problems in this area based on our experience and lessons learned in the past few years of applying our approach within the company. This paper will be useful for practitioners and software engineering researchers who are interested in build verifiability.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
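
      The verification step itself reduces to comparing digests of artifacts from two independent build runs. A minimal sketch; real pipelines first normalize artifacts, e.g., strip embedded timestamps, and the directory names below are placeholders:

          import hashlib
          from pathlib import Path

          def artifact_digests(build_dir):
              """Map each artifact's relative path to a SHA-256 of its bytes."""
              root = Path(build_dir)
              return {str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
                      for p in sorted(root.rglob("*")) if p.is_file()}

          def diff_builds(dir_a, dir_b):
              """Artifacts whose digests differ between the two build instances."""
              a, b = artifact_digests(dir_a), artifact_digests(dir_b)
              return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))

          # Two builds of identical sources should yield an empty diff.
          print(diff_builds("build-run-1", "build-run-2"))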
       
  • DevOps Research-Based Teaching Using Qualitative Research and Inter-Coder
           Agreement

      Authors: Jorge E. Pérez;Ángel González-Prieto;Jessica Díaz;Daniel López-Fernández;Javier García-Martín;Agustín Yagüe;
      Pages: 3378 - 3393
      Abstract: DevOps is becoming a main competency required by the software industry. However, academic institutions have been slow to provide DevOps training in software engineering (SE) curricula. One reason for this is the fact that the problems addressed by DevOps may be hard to understand for students who have not previously worked in the industry or on projects of meaningful size and complexity. This paper presents an experience that integrates DevOps in SE curricula through research-based teaching (RBT). We aim to expose students to the problems that have led companies to adopt DevOps by researching and analyzing real cases of companies, thereby placing students at the center of learning. The contribution of this work is to innovate the application of RBT in software engineering by showing that the RBT approach is at least as good as the traditional approach and that it also leads to some extra benefits. This innovative solution has been implemented by using (i) qualitative analysis, specifically coding techniques, to discover knowledge and (ii) inter-coder agreement (ICA), specifically Krippendorff’s α coefficients, to measure the extent of students’ learning. These techniques allow teachers to determine whether students’ learning in the subject is homogeneous and to analyze disagreements among students during their analysis. This approach provides teachers with new tools (Krippendorff’s α coefficients) to identify those concepts that are less understood by students and to evaluate whether improvements in the research instruments (e.g., the codebook used in the qualitative analysis) also generate improvements in the students’ agreement. This RBT experience shows evidence that can be used to assess whether a similar experience and the use of ICA could be applied in similar learning contexts with similar research contexts.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
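
      Krippendorff's α for nominal data, the ICA measure used above, can be computed from a coincidence matrix. A self-contained sketch with toy data; production work should prefer a vetted statistics package:

          from collections import Counter
          from itertools import permutations

          def krippendorff_alpha_nominal(data):
              """data: one equal-length list per coder; entries are labels or None."""
              coinc = Counter()
              for unit in zip(*data):                      # one tuple per coded unit
                  vals = [v for v in unit if v is not None]
                  m = len(vals)
                  if m < 2:
                      continue                             # unpairable unit
                  for a, b in permutations(vals, 2):
                      coinc[(a, b)] += 1 / (m - 1)
              n = sum(coinc.values())
              if n <= 1:
                  return float("nan")
              d_o = sum(v for (a, b), v in coinc.items() if a != b) / n
              totals = Counter()
              for (a, _), v in coinc.items():
                  totals[a] += v
              d_e = sum(totals[a] * totals[b]
                        for a, b in permutations(totals, 2)) / (n * (n - 1))
              return 1 - d_o / d_e

          coders = [["yes", "no", "yes", "no"],
                    ["yes", "no", "no",  "no"]]
          print(round(krippendorff_alpha_nominal(coders), 3))  # 0.533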
       
  • Including Everyone, Everywhere: Understanding Opportunities and Challenges
           of Geographic Gender-Inclusion in OSS

      Authors: Gede Artha Azriadi Prana;Denae Ford;Ayushi Rastogi;David Lo;Rahul Purandare;Nachiappan Nagappan;
      Pages: 3394 - 3409
      Abstract: The gender gap is a significant concern facing the software industry as development becomes more geographically distributed. Widely shared reports indicate that gender differences may be specific to each region. However, how complete can these reports be with little to no research reflective of the Open Source Software (OSS) process and the communities software is now commonly developed in? Our study presents a multi-region geographical analysis of gender inclusion on GitHub. This mixed-methods approach includes quantitatively investigating differences in gender inclusion in projects across geographic regions and investigating these trends over time using data from contributions to 21,456 project repositories. We also qualitatively understand the unique experiences of developers contributing to these projects through a survey that is strategically targeted to developers in various regions worldwide. Our findings indicate that gender diversity is low across all parts of the world, with no substantial difference across regions. However, there has been statistically significant improvement in diversity worldwide since 2014, with certain regions such as Africa improving at a faster pace. We also find that most motivations and barriers to contributions (e.g., lack of resources to contribute and poor working environment) were shared across regions; however, some insightful differences, such as how to make projects more inclusive, did arise. From these findings, we derive and present implications for tools that can foster inclusion in open source software communities and empower contributions from everyone, everywhere.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • Assisting Example-Based API Misuse Detection via Complementary Artificial
           Examples

      Authors: Maxime Lamothe;Heng Li;Weiyi Shang;
      Pages: 3410 - 3422
      Abstract: Application Programming Interfaces (APIs) allow their users to reuse existing software functionality without implementing it by themselves. However, using external functionality can come at a cost. Because developers are decoupled from the API’s inner workings, they face the possibility of misunderstanding, and therefore misusing, APIs. Prior research has proposed state-of-the-art example-based API misuse detectors that rely on existing API usage examples mined from existing code bases. Intuitively, without a varied dataset of API usage examples, it is challenging for the example-based API misuse detectors to differentiate between infrequent but correct API usages and API misuses. Such mistakes lead to false positives in the API misuse detection results, which was reported in a recent study as a major limitation of the state-of-the-art. To tackle this challenge, in this paper, we first undertake a qualitative study of 384 falsely detected API misuses. We find that around one third of the false positives are due to missing alternative correct API usage examples. Based on the knowledge gained from the qualitative study, we uncover five patterns which can be followed to generate artificial examples for complementing existing API usage examples in API misuse detection. To evaluate the usefulness of the generated artificial examples, we apply a state-of-the-art example-based API misuse detector on 50 open source Java projects. We find that our artificial examples can complement the existing API usage examples by preventing the detection of 55 false API misuses. Furthermore, we conduct a pre-designed experiment in an automated API misuse detection benchmark (MUBench), in order to evaluate the impact of generated artificial examples on recall. We find that the API misuse detector covers the same true positive results with and without the artificial example, i.e., obtains the same recall of 94.7 percent. Our findings highlight the potential of improving API misuse detection by pattern-guided source code transformation techniques.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • Post2Vec: Learning Distributed Representations of Stack Overflow Posts

      Authors: Bowen Xu;Thong Hoang;Abhishek Sharma;Chengran Yang;Xin Xia;David Lo;
      Pages: 3423 - 3441
      Abstract: Past studies have proposed solutions that analyze Stack Overflow content to help users find desired information or aid various downstream software engineering tasks. A common step performed by those solutions is to extract suitable representations of posts; typically, in the form of meaningful vectors. These vectors are then used for different tasks, for example, tag recommendation, relatedness prediction, post classification, and API recommendation. Intuitively, the quality of the vector representations of posts determines the effectiveness of the solutions in performing the respective tasks. In this work, to aid existing studies that analyze Stack Overflow posts, we propose a specialized deep learning architecture, Post2Vec, which extracts distributed representations of Stack Overflow posts. Post2Vec is aware of the different types of content present in Stack Overflow posts, i.e., title, description, and code snippets, and integrates them seamlessly to learn post representations. Tags provided by Stack Overflow users, which serve as a common vocabulary that captures the semantics of posts, are used to guide Post2Vec in its task. To evaluate the quality of Post2Vec's deep learning architecture, we first investigate its end-to-end effectiveness in the tag recommendation task. The results are compared to those of state-of-the-art tag recommendation approaches that also employ deep neural networks. We observe that Post2Vec achieves 15-25 percent improvement in terms of F1-score@5 at a lower computational cost. Moreover, to evaluate the value of representations learned by Post2Vec, we use them for three other tasks, i.e., relatedness prediction, post classification, and API recommendation. We demonstrate that the representations can be used to boost the effectiveness of state-of-the-art solutions for the three tasks by substantial margins (by 10, 7, and 10 percent in terms of F1-score, F1-score, and correctness, respectively). We release our replication package at https://github.com/maxxbw/Post2Vec.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
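
      The three-branch idea, separate encoders for title, description and code whose outputs are fused and trained against the post's tags, can be sketched in PyTorch. All sizes below are made up, and the paper's actual encoders are more sophisticated than embedding averages:

          import torch
          import torch.nn as nn

          class Post2VecSketch(nn.Module):
              """Toy three-branch encoder fused into one post vector; the tag
              head makes this a multi-label problem (train with BCEWithLogitsLoss)."""
              def __init__(self, vocab=20000, dim=64, n_tags=100):
                  super().__init__()
                  self.title = nn.EmbeddingBag(vocab, dim)   # mean-pools token ids
                  self.desc = nn.EmbeddingBag(vocab, dim)
                  self.code = nn.EmbeddingBag(vocab, dim)
                  self.tag_head = nn.Linear(3 * dim, n_tags)

              def forward(self, title_ids, desc_ids, code_ids):
                  post_vec = torch.cat([self.title(title_ids),
                                        self.desc(desc_ids),
                                        self.code(code_ids)], dim=-1)
                  return self.tag_head(post_vec)             # tag logits

          model = Post2VecSketch()
          logits = model(torch.randint(0, 20000, (4, 30)),   # batch of 4 posts
                         torch.randint(0, 20000, (4, 200)),
                         torch.randint(0, 20000, (4, 80)))
          print(logits.shape)                                # torch.Size([4, 100])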
       
  • BinDiffNN: Learning Distributed Representation of Assembly for Robust
           Binary Diffing Against Semantic Differences

      Authors: Sami Ullah;Heekuck Oh;
      Pages: 3442 - 3466
      Abstract: Binary diffing is a process to discover the differences and similarities in functionality between two binary programs. Previous research on binary diffing approaches it as a function matching problem to formulate an initial 1:1 mapping between functions; later, a sequence matching ratio is computed to classify two functions as an exact match, a partial match, or no match. The accuracy of existing techniques is best only when detecting exact matches, and they are not efficient at detecting partially changed functions, especially those with minor patches. These drawbacks are due to two major challenges: (i) in the 1:1 mapping phase, using a strict policy to match function features; (ii) in the classification phase, considering an assembly snippet as normal text and using sequence matching for similarity comparison. An instruction has a unique structure, i.e., mnemonics and registers have specific positions in an instruction and also have a semantic relationship, which makes assembly code different from general text. Sequence matching performs best for general text, but it fails to detect structural and semantic changes at an instruction level; thus, its use for classification produces many false results. In this research, we have addressed the aforementioned underlying challenges by proposing a two-fold solution. For the 1:1 mapping phase, we have proposed computationally inexpensive features, which are compared with distance-based selection criteria to map similar functions and filter unmatched functions. For the classification phase, we have proposed a Siamese binary-classification neural network where each branch is an attention-based distributed learning embedding neural network that learns the semantic similarity among assembly instructions and learns to highlight the changes at an instruction level, and a final-stage fully connected layer learns to accurately classify two 1:1-mapped functions as either an exact or a partial match. We have used x86 kernel binaries for training and achieved ~99% classification accuracy, which is higher than existing binary diffing techniques and tools.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
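
      The mapping phase described above, cheap features plus distance-based selection, might look like the following sketch. The feature choice and threshold are ours, purely illustrative:

          from collections import Counter

          def features(instrs):
              """Cheap, order-insensitive summary: mnemonic counts plus size."""
              return Counter(i.split()[0] for i in instrs), len(instrs)

          def distance(f1, f2):
              (m1, n1), (m2, n2) = f1, f2
              return sum(abs(m1[k] - m2[k]) for k in m1.keys() | m2.keys()) + abs(n1 - n2)

          def map_functions(bin_a, bin_b, max_dist=5):
              """Greedy 1:1 mapping: nearest unclaimed candidate under a threshold."""
              feats_b = {name: features(ins) for name, ins in bin_b.items()}
              mapping, claimed = {}, set()
              for name_a, ins_a in bin_a.items():
                  fa = features(ins_a)
                  cands = [(distance(fa, fb), nb) for nb, fb in feats_b.items()
                           if nb not in claimed]
                  if cands:
                      d, nb = min(cands)
                      if d <= max_dist:      # filter improbable pairs
                          mapping[name_a] = nb
                          claimed.add(nb)
              return mapping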
       
  • “I just looked for the solution!” On Integrating Security-Relevant
           Information in Non-Security API Documentation to Support Secure Coding
           Practices

      Authors: Peter Leo Gorski;Sebastian Möller;Stephan Wiefling;Luigi Lo Iacono;
      Pages: 3467 - 3484
      Abstract: Software developers build complex systems using plenty of third-party libraries. Documentation is key to understand and use the functionality provided via the libraries’ APIs. Therefore, functionality is the main focus of contemporary API documentation, while cross-cutting concerns such as security are almost never considered at all, especially when the API itself does not provide security features. Documentation of JavaScript libraries for use in web applications, for example, does not specify how to add or adapt a Content Security Policy (CSP) to mitigate content injection attacks like Cross-Site Scripting (XSS). This is unfortunate, as security-relevant API documentation might have an influence on secure coding practices and prevailing major vulnerabilities such as XSS. For the first time, we study the effects of integrating security-relevant information in non-security API documentation. For this purpose, we took CSP as an exemplary study object and extended the official Google Maps JavaScript API documentation with security-relevant CSP information in three distinct manners. Then, we evaluated the usage of these variations in a between-group eye-tracking lab study involving N=49 participants. Our observations suggest: (1) Developers are focused on elements with code examples. They mostly skim the documentation while searching for a quick solution to their programming task. This finding gives further evidence to results of related studies. (2) The location where CSP-related code examples are placed in non-security API documentation significantly impacts the time it takes to find this security-relevant information. In particular, the study results showed that the proximity to functionality-related code examples in documentation is a decisive factor. (3) Examples significantly help to produce secure CSP solutions. (4) Developers have additional information needs that our approach cannot meet. Overall, our study contributes to a first understanding of the impact of security-relevant information in non-security API documentation on CSP implementation. Although further research is required, our findings emphasize that API producers should take responsibility for adequately documenting security aspects and thus supporting the sensibility and training of developers to implement secure systems. This responsibility also holds in seemingly non-security-relevant contexts.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
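
      For concreteness, adding a CSP to a web application is a one-header change. A hedged sketch in Flask; the allowed hosts below are placeholders, not Google's documented requirements for the Maps JavaScript API:

          from flask import Flask

          app = Flask(__name__)

          # Illustrative policy only; consult the library's documentation for the
          # hosts it actually loads scripts, images and styles from.
          CSP = ("default-src 'self'; "
                 "script-src 'self' https://maps.googleapis.com; "
                 "img-src 'self' data: https://maps.gstatic.com; "
                 "style-src 'self' https://fonts.googleapis.com")

          @app.after_request
          def set_csp(response):
              response.headers["Content-Security-Policy"] = CSP
              return response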
       
  • Trimmer: An Automated System for Configuration-Based Software
           Debloating

      Authors: Aatira Anum Ahmad;Abdul Rafae Noor;Hashim Sharif;Usama Hameed;Shoaib Asif;Mubashir Anwar;Ashish Gehani;Fareed Zaffar;Junaid Haroon Siddiqui;
      Pages: 3485 - 3505
      Abstract: Software bloat has negative implications for security, reliability, and performance. To counter bloat, we propose Trimmer, a static analysis-based system for pruning unused functionality. Trimmer removes code that is unused with respect to user-provided command-line arguments and application-specific configuration files. Trimmer uses concrete memory tracking and a custom inter-procedural constant propagation analysis that facilitates dead code elimination. Our system supports both context-sensitive and context-insensitive constant propagation. We show that context-sensitive constant propagation is important for effective software pruning in most applications. We introduce sparse constant propagation that performs constant propagation only for configuration-hosting variables and show that it performs better (higher code size reductions) compared to constant propagation for all program variables. Overall, our results show that Trimmer reduces binary sizes for real-world programs with reasonable analysis times. Across 20 evaluated programs, we observe a mean binary size reduction of 22.7 percent and a maximum reduction of 62.7 percent. For 5 programs, we observe performance speedups ranging from 5 to 53 percent. Moreover, we show that winnowing software applications can reduce the program attack surface by removing code that contains exploitable vulnerabilities. We find that debloating using Trimmer removes CVEs in 4 applications.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
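
      The heart of configuration-based debloating, propagating known configuration constants and deleting branches that become statically dead, fits in a toy interpreter. A sketch over a made-up mini-IR, not Trimmer's actual LLVM passes:

          def prune(stmts, config):
              """Toy debloating pass. stmts is a list of
              ('assign', var, const) or ('if', var, body) tuples."""
              env = dict(config)                 # e.g. {'debug': 0} from CLI flags
              out = []
              for s in stmts:
                  if s[0] == 'assign':
                      _, var, const = s
                      env[var] = const           # constant propagation
                      out.append(s)
                  elif s[0] == 'if':
                      _, var, body = s
                      if var in env:             # condition is a known constant
                          if env[var]:
                              out.extend(prune(body, env))  # keep body, drop test
                          # else: statically dead branch, eliminated entirely
                      else:
                          out.append(('if', var, prune(body, env)))
              return out

          prog = [('assign', 'debug', 0), ('if', 'debug', [('assign', 'x', 1)])]
          print(prune(prog, {}))  # [('assign', 'debug', 0)] -- branch removed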
       
  • An Ensemble Approach for Annotating Source Code Identifiers With
           Part-of-Speech Tags

      Authors: Christian D. Newman;Michael J. Decker;Reem S. Alsuhaibani;Anthony Peruma;Mohamed Wiem Mkaouer;Satyajit Mohapatra;Tejal Vishnoi;Marcos Zampieri;Timothy J. Sheldon;Emily Hill;
      Pages: 3506 - 3522
      Abstract: This paper presents an ensemble part-of-speech tagging approach for source code identifiers. Ensemble tagging is a technique that uses machine learning and the output from multiple part-of-speech taggers to annotate natural language text at a higher quality than the part-of-speech taggers are able to obtain independently. Our ensemble uses three state-of-the-art part-of-speech taggers: SWUM, POSSE, and Stanford. We study the quality of the ensemble’s annotations on five different types of identifier names: function, class, attribute, parameter, and declaration statement, at the level of both individual words and full identifier names. We also study and discuss the weaknesses of our tagger to promote the future amelioration of these problems through further research. Our results show that the ensemble achieves 75 percent accuracy at the identifier level and 84-86 percent accuracy at the word level. This is an increase of 17 percentage points at the identifier level over the closest independent part-of-speech tagger.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
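
      As a baseline intuition for ensemble tagging, consider majority voting over several taggers. The actual ensemble trains a model on the taggers' outputs, and the three stand-in taggers below are hypothetical:

          from collections import Counter

          def ensemble_tag(identifier_words, taggers):
              """Tag each word of a split identifier by majority vote."""
              tags = []
              for word in identifier_words:
                  votes = Counter(t(word) for t in taggers)
                  tags.append(votes.most_common(1)[0][0])
              return list(zip(identifier_words, tags))

          # Stand-ins for SWUM, POSSE and the Stanford tagger (all fake here).
          swum = lambda w: "N" if w.endswith("s") else "V"
          posse = lambda w: "N"
          stanford = lambda w: "V" if w in {"get", "set", "run"} else "N"

          print(ensemble_tag(["get", "user", "names"], [swum, posse, stanford]))
          # [('get', 'V'), ('user', 'N'), ('names', 'N')]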
       
  • What Makes Agile Software Development Agile?

      Authors: Marco Kuhrmann;Paolo Tell;Regina Hebig;Jil Klünder;Jürgen Münch;Oliver Linssen;Dietmar Pfahl;Michael Felderer;Christian R. Prause;Stephen G. MacDonell;Joyce Nakatumba-Nabende;David Raffo;Sarah Beecham;Eray Tüzün;Gustavo López;Nicolas Paez;Diego Fontdevila;Sherlock A. Licorish;Steffen Küpper;Günther Ruhe;Eric Knauss;Özden Özcan-Top;Paul Clarke;Fergal McCaffery;Marcela Genero;Aurora Vizcaino;Mario Piattini;Marcos Kalinowski;Tayana Conte;Rafael Prikladnicki;Stephan Krusche;Ahmet Coşkunçay;Ezequiel Scott;Fabio Calefato;Svetlana Pimonova;Rolf-Helge Pfeiffer;Ulrik Pagh Schultz;Rogardt Heldal;Masud Fazal-Baqaie;Craig Anslow;Maleknaz Nayebi;Kurt Schneider;Stefan Sauer;Dietmar Winkler;Stefan Biffl;Maria Cecilia Bastarrica;Ita Richardson;
      Pages: 3523 - 3539
      Abstract: Together with many success stories, promises such as the increase in production speed and the improvement in stakeholders’ collaboration have contributed to making agile a transformation in the software industry in which many companies want to take part. However, driven either by a natural and expected evolution or by contextual factors that challenge the adoption of agile methods as prescribed by their creator(s), software processes in practice mutate into hybrids over time. Are these still agile? In this article, we investigate the question: what makes a software development method agile? We present an empirical study grounded in a large-scale international survey that aims to identify software development methods and practices that improve or tame agility. Based on 556 data points, we analyze the perceived degree of agility in the implementation of standard project disciplines and its relation to used development methods and practices. Our findings suggest that only a small number of participants operate their projects in a purely traditional or agile manner (under 15 percent). That said, most project disciplines and most practices show a clear trend towards increasing degrees of agility. Compared to the methods used to develop software, the selection of practices has a stronger effect on the degree of agility of a given discipline. Finally, there are no methods or practices that explicitly guarantee or prevent agility. We conclude that agility cannot be defined solely at the process level. Additional factors need to be taken into account when trying to implement or improve agility in a software company. Finally, we discuss the field of software process-related research in the light of our findings and present a roadmap for future research.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • Hashing Fuzzing: Introducing Input Diversity to Improve Crash Detection

      Authors: Hector D. Menendez;David Clark;
      Pages: 3540 - 3553
      Abstract: The utility of a test set of program inputs is strongly influenced by its diversity and its size. Syntax coverage has become a standard proxy for diversity. Although more sophisticated measures exist, such as proximity of a sample to a uniform distribution, methods to use them tend to be type dependent. We use r-wise hash functions to create a novel, semantics-preserving, testability transformation for C programs that we call HashFuzz. Use of HashFuzz improves the diversity of test sets produced by instrumentation-based fuzzers. We evaluate the effect of the HashFuzz transformation on eight programs from the Google Fuzzer Test Suite using four state-of-the-art fuzzers that have been widely used in previous research. We demonstrate pronounced improvements in the performance of the test sets for the transformed programs across all the fuzzers that we used. These include strong improvements in diversity in every case, maintenance or small improvement in branch coverage – up to 4.8 percent improvement in the best case – and significant improvement in unique crash detection numbers – between 28 and 97 percent increases compared to test sets for untransformed programs.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
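
      The r-wise hash functions mentioned above generalize the classic 2-wise independent family h(x) = ((a·x + b) mod p) mod m. A sketch of drawing such a hash and using bucket coverage as a crude diversity proxy; the instrumentation details are ours:

          import random

          P = 2**61 - 1          # a Mersenne prime; inputs are treated mod P

          def make_pairwise_hash(buckets):
              """h(x) = ((a*x + b) mod P) mod buckets -- the classic 2-wise
              independent family; HashFuzz builds on r-wise generalisations."""
              a = random.randrange(1, P)
              b = random.randrange(P)
              return lambda x: ((a * x + b) % P) % buckets

          def bucket_diversity(inputs, h, buckets):
              """Fraction of distinct buckets hit: a crude diversity proxy that
              an instrumented program can expose to the fuzzer."""
              hits = {h(int.from_bytes(x, "little")) for x in inputs}
              return len(hits) / buckets

          h = make_pairwise_hash(buckets=256)
          tests = [b"abc", b"abd", b"xyz", b"abc"]
          print(bucket_diversity(tests, h, buckets=256))  # at most 3/256 here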
       
  • Testing Self-Adaptive Software With Probabilistic Guarantees on
           Performance Metrics: Extended and Comparative Results

      Authors: Claudio Mandrioli;Martina Maggio;
      Pages: 3554 - 3572
      Abstract: This paper discusses methods to test the performance of the adaptation layer in a self-adaptive system. The problem is notoriously hard, due to the high degree of uncertainty and variability inherent in an adaptive software application. In particular, providing any type of formal guarantee for this problem is extremely difficult. In this paper we propose the use of a rigorous probabilistic approach to overcome the mentioned difficulties and provide probabilistic guarantees on the software performance. We describe the setup needed for the application of a probabilistic approach. We then discuss the traditional tools from statistics that could be applied to analyse the results, highlighting their limitations and motivating why they are unsuitable for the given problem. We propose the use of a novel tool – the Scenario Theory – to overcome said limitations. We conclude the paper with a thorough empirical evaluation of the proposed approach, using three adaptive software applications: the Tele-Assistance Service, the Self-Adaptive Video Encoder, and the Traffic Reconfiguration via Adaptive Participatory Planning. With the first, we empirically expose the trade-off between data collection and confidence in the testing campaign. With the second, we demonstrate how to compare different adaptation strategies. With the third, we discuss the role of randomisation in the selection of test inputs. In the evaluation, we apply the scenario theory and also classical statistical tools: Monte Carlo and Extreme Value Theory. We provide a complete evaluation and a thorough comparison of the confidence and guarantees that can be given with all the approaches.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
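
      In the simplest scenario-theory setting (observe N random test runs and report the worst case), the required sample size follows from (1 − ε)^N ≤ β. A sketch of that calculation; the paper's actual bounds cover richer settings:

          import math

          def scenario_sample_size(epsilon, beta):
              """Smallest N with (1 - epsilon)**N <= beta: after N i.i.d. test
              scenarios, the worst observed performance bounds a fraction
              1 - epsilon of all scenarios, with confidence 1 - beta."""
              return math.ceil(math.log(beta) / math.log(1 - epsilon))

          # "95% of executions do no worse than the worst case we observed",
          # claimed with 99.9% confidence:
          print(scenario_sample_size(epsilon=0.05, beta=1e-3))  # 135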
       
  • Factors Affecting On-Time Delivery in Large-Scale Agile Software
           Development

      Authors: Elvan Kula;Eric Greuter;Arie van Deursen;Georgios Gousios;
      Pages: 3573 - 3592
      Abstract: Late delivery of software projects and cost overruns have been common problems in the software industry for decades. Both problems are manifestations of deficiencies in effort estimation during project planning. With software projects being complex socio-technical systems, a large pool of factors can affect effort estimation and on-time delivery. To identify the most relevant factors and their interactions affecting schedule deviations in large-scale agile software development, we conducted a mixed-methods case study at ING: two rounds of surveys revealed a multitude of organizational, people, process, project and technical factors which were then quantified and statistically modeled using software repository data from 185 teams. We find that factors such as requirements refinement, task dependencies, organizational alignment and organizational politics are perceived to have the greatest impact on on-time delivery, whereas proxy measures such as project size, number of dependencies, historical delivery performance and team familiarity can help explain a large degree of schedule deviations. We also discover hierarchical interactions among factors: organizational factors are perceived to interact with people factors, which in turn impact technical factors. We compose our findings in the form of a conceptual framework representing influential factors and their relationships to on-time delivery. Our results can help practitioners identify and manage delay risks in agile settings, can inform the design of automated tools to predict schedule overruns and can contribute towards the development of a relational theory of software project management.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • Automatic Fairness Testing of Neural Classifiers Through Adversarial
           Sampling

      Authors: Peixin Zhang;Jingyi Wang;Jun Sun;Xinyu Wang;Guoliang Dong;Xingen Wang;Ting Dai;Jin Song Dong;
      Pages: 3593 - 3612
      Abstract: Although deep learning has demonstrated astonishing performance in many applications, there are still concerns about its dependability. One desirable property of deep learning applications with societal impact is fairness (i.e., non-discrimination). Unfortunately, discrimination might be intrinsically embedded into the models due to the discrimination in the training data. As a countermeasure, fairness testing systemically identifies discriminatory samples, which can be used to retrain the model and improve the model’s fairness. Existing fairness testing approaches however have two major limitations. First, they only work well on traditional machine learning models and have poor performance (e.g., effectiveness and efficiency) on deep learning models. Second, they only work on simple structured (e.g., tabular) data and are not applicable for domains such as text. In this work, we bridge the gap by proposing a scalable and effective approach for systematically searching for discriminatory samples while extending existing fairness testing approaches to address a more challenging domain, i.e., text classification. Compared with state-of-the-art methods, our approach only employs lightweight procedures like gradient computation and clustering, which makes it significantly more scalable and effective. Experimental results show that on average, our approach explores the search space much more effectively (9.62 and 2.38 times more than the state-of-the-art methods respectively on tabular and text datasets) and generates many more discriminatory samples (24.95 and 2.68 times) within the same reasonable time. Moreover, the retrained models reduce discrimination by 57.2 and 60.2 percent respectively on average.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
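
      For tabular data, an individually discriminatory sample is one whose prediction flips when only the protected attribute changes; the gradient then steers the search toward such samples. A PyTorch sketch with a hypothetical layout (one flat feature vector per sample, `model` assumed given):

          import torch

          def is_discriminatory(model, x, protected_idx, values):
              """True if changing only the protected feature flips the decision."""
              preds = set()
              for v in values:
                  x2 = x.clone()
                  x2[protected_idx] = v
                  preds.add(int(model(x2.unsqueeze(0)).argmax(dim=1)))
              return len(preds) > 1

          def perturb(model, x, protected_idx, step=0.1):
              """Gradient-guided step on the non-protected features, after which
              the discrimination check above is re-run."""
              x = x.clone().requires_grad_(True)
              model(x.unsqueeze(0)).max().backward()
              g = x.grad.clone()
              g[protected_idx] = 0.0     # never touch the protected feature itself
              return (x + step * g.sign()).detach()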
       
  • Identifying Challenges for OSS Vulnerability Scanners - A Study &
           Test Suite

      Authors: Andreas Dann;Henrik Plate;Ben Hermann;Serena Elisa Ponta;Eric Bodden;
      Pages: 3613 - 3625
      Abstract: The use of vulnerable open-source dependencies is a known problem in today's software development. Several vulnerability scanners to detect known-vulnerable dependencies appeared in the last decade; however, there exists no case study investigating the impact of development practices, e.g., forking, patching, re-bundling, on their performance. This paper studies (i) types of modifications that may affect vulnerable open-source dependencies and (ii) their impact on the performance of vulnerability scanners. Through an empirical study on 7,024 Java projects developed at SAP, we identified four types of modifications: re-compilation, re-bundling, metadata-removal and re-packaging. In particular, we found that more than 87 percent (56 percent, resp.) of the vulnerable Java classes considered occur in Maven Central in re-bundled (re-packaged, resp.) form. We assessed the impact of these modifications on the performance of the open-source vulnerability scanners OWASP Dependency-Check (OWASP) and Eclipse Steady, GitHub Security Alerts, and three commercial scanners. The results show that none of the scanners is able to handle all the types of modifications identified. Finally, we present Achilles, a novel test suite with 2,505 test cases that allows replicating the modifications on open-source dependencies.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • What Drives and Sustains Self-Assignment in Agile Teams

      Authors: Zainab Masood;Rashina Hoda;Kelly Blincoe;
      Pages: 3626 - 3639
      Abstract: Self-assignment, where software developers choose their own tasks, is a common practice in agile teams. However, it is not known why developers select certain tasks. It is important for managers to be aware of these reasons to ensure sustainable self-assignment practices. We investigated developers’ preferences while they are choosing tasks for themselves. We collected data from 42 participants working in 25 different software companies. We applied Grounded Theory procedures to study and analyse factors for self-assigning tasks, which we grouped into three categories: task-based, developer-based, and opinion-based. We found that developers have individual preferences and not all factors are important to every developer. Managers share some common and varying perspectives around the identified factors. Most managers want developers to give higher priority to certain factors. Developers often need to balance between task priority and their own individual preferences, and managers facilitate this through a variety of strategies. More risk-averse managers encourage expertise-based self-assignment to ensure tasks are completed quickly. Managers who are risk-balancing encourage developers to choose tasks that provide learning opportunities only when there is little risk of delays or reduced quality. Finally, growth-seeking managers regularly encourage team members to pick tasks outside their comfort zone to encourage growth opportunities. Our findings will help managers to understand what developers consider when self-assigning tasks and help them empower their teams to practice self-assignment in a sustainable manner.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
       
  • Enhancing Dynamic Symbolic Execution by Automatically Learning Search
           Heuristics

      Authors: Sooyoung Cha;Seongjoon Hong;Jiseong Bak;Jingyoung Kim;Junhee Lee;Hakjoo Oh;
      Pages: 3640 - 3663
      Abstract: We present a technique to automatically generate search heuristics for dynamic symbolic execution. A key challenge in dynamic symbolic execution is how to effectively explore the program's execution paths to achieve high code coverage in a limited time budget. Dynamic symbolic execution employs a search heuristic to address this challenge, which favors exploring particular types of paths that are most likely to maximize the final coverage. However, manually designing a good search heuristic is nontrivial and typically ends up with suboptimal and unstable outcomes. The goal of this paper is to overcome this shortcoming of dynamic symbolic execution by automatically learning search heuristics. We define a class of search heuristics, namely a parametric search heuristic, and present an algorithm that efficiently finds an optimal heuristic for each subject program. Experimental results with industrial-strength symbolic execution tools (e.g., KLEE) show that our technique can successfully generate search heuristics that significantly outperform existing manually-crafted heuristics in terms of branch coverage and bug-finding.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
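
      A parametric search heuristic is just a weighted scoring of pending paths; learning then searches over the weight vector for the parameters that maximize coverage. A sketch with illustrative features, not KLEE's actual searcher API:

          def pick_next_path(paths, weights):
              """Explore the pending path with the highest weighted feature score."""
              def score(path):
                  feats = [path["depth"], path["uncovered_branches"],
                           path["recent_coverage_gain"]]
                  return sum(w * f for w, f in zip(weights, feats))
              return max(paths, key=score)

          pending = [{"depth": 3, "uncovered_branches": 7, "recent_coverage_gain": 2},
                     {"depth": 9, "uncovered_branches": 1, "recent_coverage_gain": 0}]
          # Negative weight on depth discourages deep, unproductive paths.
          print(pick_next_path(pending, weights=[-0.2, 1.0, 0.5]))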
       
  • Combining Genetic Programming and Model Checking to Generate Environment
           Assumptions

      Authors: Khouloud Gaaloul;Claudio Menghi;Shiva Nejati;Lionel C. Briand;Yago Isasi Parache;
      Pages: 3664 - 3685
      Abstract: Software verification may yield spurious failures when environment assumptions are not accounted for. Environment assumptions are the expectations that a system or a component makes about its operational environment and are often specified in terms of conditions over the inputs of that system or component. In this article, we propose an approach to automatically infer environment assumptions for Cyber-Physical Systems (CPS). Our approach improves the state-of-the-art in three different ways: First, we learn assumptions for complex CPS models involving signal and numeric variables; second, the learned assumptions include arithmetic expressions defined over multiple variables; third, we identify the trade-off between soundness and coverage of environment assumptions and demonstrate the flexibility of our approach in prioritizing either of these criteria. We evaluate our approach using a public domain benchmark of CPS models from Lockheed Martin and a component of a satellite control system from LuxSpace, a satellite system provider. The results show that our approach outperforms state-of-the-art techniques on learning assumptions for CPS models, and further, when applied to our industrial CPS model, our approach is able to learn assumptions that are sufficiently close to the assumptions manually developed by engineers to be of practical value.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
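
      The soundness/coverage trade-off named above can be made concrete: soundness asks how often inputs admitted by the assumption actually pass, coverage asks how many passing inputs the assumption admits. A toy sketch with a stand-in verdict function (in the paper, the verdict comes from model checking):

          def soundness(assume, inputs, verdict):
              """Among inputs permitted by the assumption, how many pass?"""
              sat = [x for x in inputs if assume(x)]
              return sum(map(verdict, sat)) / len(sat) if sat else 1.0

          def coverage(assume, inputs, verdict):
              """Among all passing inputs, how many does the assumption admit?"""
              ok = [x for x in inputs if verdict(x)]
              return sum(map(assume, ok)) / len(ok) if ok else 1.0

          inputs = range(-10, 11)
          verdict = lambda x: x * x < 50    # stand-in for a verification verdict
          loose = lambda x: abs(x) <= 8     # high coverage, imperfect soundness
          tight = lambda x: abs(x) <= 3     # perfectly sound, low coverage
          for a in (loose, tight):
              print(soundness(a, inputs, verdict), coverage(a, inputs, verdict))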
       
  • Detecting the Locations and Predicting the Maintenance Costs of Compound
           Architectural Debts

      Authors: Lu Xiao;Yuanfang Cai;Rick Kazman;Ran Mo;Qiong Feng;
      Pages: 3686 - 3715
      Abstract: Architectural Technical Debt (ATD) refers to sub-optimal architectural design in a software system that incurs high maintenance “interest” over time. Previous research revealed that ATD has a significant negative impact on daily development. This paper contributes an approach to enable an architect to precisely locate ATDs, as well as capture the trajectory of maintenance cost on each debt, based on which to predict the cost of the debt in a future release. The ATDs are expressed in four typical patterns, which entail the core of each debt. Furthermore, we aggregate compound ATDs to capture the complicated relationship among multiple ATD instances, which should be examined together for effective refactoring solutions. We evaluate our approach on 18 real-world projects. We identified ATDs that persistently incur significant (up to 95 percent of) maintenance costs in most projects. The maintenance costs on the majority of debts fit into a linear regression model—indicating a stable “interest” rate. In five projects, 12.1 to 27.6 percent of debts fit into an exponential model, indicating an increasing “interest” rate; these deserve higher priority from architects. The regression models can accurately predict the costs of the majority (82 to 100 percent) of debts in the next release of a system. By aggregating related ATDs, architects can focus on a small number of cost-effective compound debts, which contain a relatively small number of source files but account for a large portion of maintenance costs in their projects. With these capabilities, our approach can help architects make informed decisions regarding whether, where, and how to refactor for eliminating ATDs in their systems.
      PubDate: Sept. 1 2022
      Issue No: Vol. 48, No. 9 (2022)
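
      The two cost models above, stable versus increasing “interest”, correspond to linear and exponential regressions over per-release maintenance cost. A sketch with invented numbers using NumPy:

          import numpy as np

          releases = np.arange(1, 9)                       # release index
          cost = np.array([5, 9, 14, 18, 24, 28, 33, 38])  # toy per-release cost

          # Linear model: stable "interest" rate.
          slope, intercept = np.polyfit(releases, cost, 1)

          # Exponential model cost ~ a * exp(b * r): fit a line in log space.
          b, log_a = np.polyfit(releases, np.log(cost), 1)

          next_r = 9
          print("linear forecast:", slope * next_r + intercept)
          print("exponential forecast:", np.exp(log_a) * np.exp(b * next_r))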
       
 