University of Saskatchewan - Computer Science
Assistant Professor
Teaching + Research in Software Engineering
Professor
Teaching + Research in Software Engineering
Associate Professor
Teaching + Research in Software Engineering
BSc in CSE
Bachelor of Science in Computer Science and Engineering
PhD
Computer Software Engineering
MSc
Software Systems Engineering
16th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS 2011)
The effort required to develop and maintain large, complex software is believed to depend on the amount of duplicated code fragments (code clones) present in the codebase. For example, clones need to be carefully and consistently maintained and/or refactored to prevent accidental error propagation. Thus, it is important to understand the proportion and evolution of clones in evolving software systems for purposes such as cost estimation. This paper presents a study on the evolution of near-miss clones at the release level in medium to large open source software systems of different types (operating systems, database systems, editors, etc.) written in three programming languages: C, C#, and Java. Using a hybrid clone detector, NiCad, we detected both exact and near-miss clones at different levels of similarity. Applying statistical methods, we investigated, from different dimensions, the evolution of both exact and near-miss clones, and also forecasted the amount of clones in future releases of the software systems. Our study offers significant insights into the existence and evolution of code clones and their relationships with programming language or paradigm and program size.
19th IEEE International Conference on Program Comprehension (ICPC 2011) - student symposium
Duplicated code, also known as code clones, is one of the malicious ‘code smells’ that often need to be removed through refactoring to enhance maintainability. Among all the potential refactoring opportunities, the choice and order of a set of refactoring activities may have a distinguishable effect on the design/code quality. Moreover, there may be dependencies and conflicts among those refactorings. The organization may also impose priorities on certain refactoring activities. Given all these conflicts, priorities, and dependencies, manual formulation of an optimal refactoring schedule is very expensive, if not impossible. Therefore, an automated refactoring scheduler is necessary, one that maximizes benefit and minimizes refactoring effort. In this paper, we present a refactoring effort model and propose a constraint programming approach for conflict-aware optimal scheduling of code clone refactoring.
10th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2010)
Code clone genealogies show how clone groups evolve with the evolution of the associated software system, and thus could provide important insights into the maintenance implications of clones. In this paper, we provide an in-depth empirical study for evaluating clone genealogies in evolving open source systems at the release level. We develop a clone genealogy extractor, examine 17 open source C, Java, C++, and C# systems of diverse varieties, and study different dimensions of how clone groups evolve with the evolution of the software systems. Our study shows that the majority of the clone groups of the clone genealogies either propagate without any syntactic changes or change consistently in the subsequent releases, and that many of the genealogies remain alive during the evolution. These findings seem to be consistent with the findings of a previous study that clones may not be as detrimental to software maintenance as they are believed to be (at least by many of us), and that instead of aggressively refactoring clones, we should possibly focus on tracking and managing clones during the evolution of software systems.
18th IEEE Working Conference on Reverse Engineering (WCRE 2011)
Software development today depends largely on the use of API libraries, frameworks, and reusable components. However, API usability issues often increase the development cost (e.g., time, effort) and lower code quality. In this regard, we study 1,513 bug-posts across five different bug repositories, using both qualitative and quantitative analysis. We identify the API usability issues that are reflected in the bug-posts from the API users, and distinguish the relative significance of the usability factors. Moreover, from the lessons learned by manual investigation of the bug-posts, we provide further insight into the most frequent API usability issues.
11th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2011)
Duplicated code, also known as code clones, is one of the malicious ‘code smells’ that often need to be removed through refactoring to enhance maintainability. Among all the potential refactoring opportunities, the choice and order of a set of refactoring activities may have a distinguishable effect on the design/code quality. Moreover, there may be dependencies and conflicts among those refactorings. The organization may also impose priorities on certain refactoring activities. Given all these conflicts, priorities, and dependencies, manual formulation of an optimal refactoring schedule is very expensive, if not impossible. Therefore, an automated refactoring scheduler is necessary, one that maximizes benefit and minimizes refactoring effort. In this paper, we present a refactoring effort model and propose a constraint programming approach for conflict-aware optimal scheduling of code clone refactoring.
5th International Workshop on Software Clones (IWSC 2011)
In this paper, we propose an IDE-based clone management system to flexibly detect, manage, and refactor both exact and near-miss code clones. Using a k-difference hybrid suffix tree algorithm, we can efficiently detect both exact and near-miss clones. We have implemented the algorithm as a plugin to the Eclipse IDE, and are extending it for real-time code clone management with semi-automated refactoring support during the actual development process.
Technical Report 2012-03, Department of Computer Science, The University of Saskatchewan
Duplicated code, or code clones, is a kind of code smell that has both positive and negative impacts on the development and maintenance of software systems. Software clone research in the past mostly focused on the detection and analysis of code clones, while the research in recent years indicates that clone management is taking center stage due to its pragmatic importance. Over more than a decade of research on software clones, three notable surveys have appeared in the literature, covering the detection, analysis, and evolutionary characteristics of code clones. This paper presents a comprehensive survey of the state of the art in clone management, with an in-depth investigation of clone management activities (e.g., tracing, refactoring, cost-benefit analysis) beyond detection and analysis. This is the first survey on clone management, where we point to the achievements so far and reveal avenues for the further research necessary towards an integrated clone management system.
16th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS 2011)
Effort for development and maintenance of complex large software is believed to have dependency on the amount of duplicated code fragments (code clones) present in codebases. For example, clones need to be carefully and consistently maintained and/or refactored for preventing accidental error propagation. Thus it is important to understand the proportion and evolution of clones in evolving software systems for cost estimation or the like. This paper presents a study on the evolution of near-miss clones at release level in medium to large open source software systems of different types (operating systems, database systems, editors, etc.) written in three different programming languages namely C, C#, and Java. Using a hybrid clone detector, NiCad, we detected both exact and near-miss clones at different levels of similarity. Applying statistical methods we investigated, from different dimensions, the evolution of both exact and nearmiss clones, and also forecasted the amount of clones in future releases of the software systems. Our study offers significant insights into the existence and evolution of code clones and their relationships with programming language or paradigm and program size.
19th IEEE International Conference on Program Comprehension (ICPC 2011) - student symposium
Duplicated code, also known as code clones, are one of the malicious ‘code smells’ that often need to be removed through refactoring for enhancing maintainability. Among all the potential refactoring opportunities, the choice and order of a set of refactoring activities may have distinguishable effect on the design/code quality. Moreover, there may be dependencies and conflicts among those refactorings. The organization may also impose priorities on certain refactoring activities. Addressing all these conflicts, priorities, and dependencies, manual formulation of an optimal refactoring schedule is very expensive, if not impossible. Therefore, an automated refactoring scheduler is necessary, which will maximize benefit and minimize refactoring effort. In this paper, we present a refactoring effort model, and propose a constraint programming approach for conflict-aware optimal scheduling of code clone refactoring.
10th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2010)
Code clone genealogies show how clone groups evolve with the evolution of the associated software system, and thus could provide important insights on the maintenance implications of clones. In this paper, we provide an in-depth empirical study for evaluating clone genealogies in evolving open source systems at the release level. We develop a clone genealogy extractor, examine 17 open source C, Java, C++ and C# systems of diverse varieties and study different dimensions of how clone groups evolve with the evolution of the software systems. Our study shows that the majority of the clone groups of the clone genealogies either propagate without any syntactic changes or change consistently in the subsequent releases, and that many of the genealogies remain alive during the evolution. These findings seem to be consistent with the findings of a previous study that clones may not be as detrimental in software maintenance as believed (at least by many of us), and that instead of aggressively refactoring clones, we should possibly focus on tracking and managing clones during the evolution of software systems.
18th IEEE Working Conference on Reverse Engineering (WCRE 2011)
Software development today has been largely dependent on the use of API libraries, frameworks, and reusable components. However, the API usability issues often increase the development cost (e.g., time, effort) and lower code quality. In this regard, we study 1,513 bug-posts across five different bug repositories, using both qualitative and quantitative analysis. We identify the API usability issues that are reflected in the bug-posts from the API users, and distinguish relative significance of the usability factors. Moreover, from the lessons learned by manual investigation of the bug-posts, we provide further insight into the most frequent API usability issues.
11th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2011)
Duplicated code, also known as code clones, is one of the malicious ‘code smells’ that often needs to be removed through refactoring to enhance maintainability. Among all the potential refactoring opportunities, the choice and order of a set of refactoring activities may have a distinguishable effect on the design/code quality. Moreover, there may be dependencies and conflicts among those refactorings. The organization may also impose priorities on certain refactoring activities. Given all these conflicts, priorities, and dependencies, manual formulation of an optimal refactoring schedule is very expensive, if not impossible. Therefore, an automated refactoring scheduler is necessary, which will maximize benefit and minimize refactoring effort. In this paper, we present a refactoring effort model, and propose a constraint programming approach for conflict-aware optimal scheduling of code clone refactoring.
5th International Workshop on Software Clones (IWSC 2011)
In this paper, we propose an IDE-based clone management system to flexibly detect, manage, and refactor both exact and near-miss code clones. Using a k-difference hybrid suffix tree algorithm we can efficiently detect both exact and near-miss clones. We have implemented the algorithm as a plugin to the Eclipse IDE, and have been extending this for real-time code clone management with semi-automated refactoring support during the actual development process.
Technical Report 2012-03, Department of Computer Science, The University of Saskatchewan
Duplicated code or code clones are a kind of code smell that have both positive and negative impact on the development and maintenance of software systems. Software clone research in the past mostly focused on the detection and analysis of code clones, while the research in recent years indicates that clone management is taking the center stage due to its pragmatic importance. Over more than a decade of research on software clones, notably three surveys appeared in the literature, which cover the detection, analysis, and evolutionary characteristics of code clones. This paper presents a comprehensive survey on the state of the art in clone management, with in-depth investigation of clone management activities (e.g., tracing, refactoring, cost-benefit analysis) beyond the detection and analysis. This is the first survey on clone management, where we point to the achievements so far, and reveal avenues for further research necessary towards an integrated clone management system.
27th ACM Symposium On Applied Computing (ACM SAC 2012, Software Engineering Track)
Code clone is a well-known code smell that needs to be detected and managed during the software development process. However, the existing clone detectors have one or more of the three shortcomings: (a) limitation in detecting Type-3 clones, (b) they come as stand-alone tools separate from IDE and thus cannot support clone-aware development, (c) they overwhelm the developer with all clones from the entire code-base, instead of a focused search for clones of a selected code segment of the developer's interest. This paper presents our IDE-integrated clone search tool, that addresses all the above issues. For clone detection, we adapt a suffix-tree-based hybrid algorithm. Through an asymptotic analysis, we show that our approach for clone detection is both time and memory efficient. Moreover, using three separate empirical studies, we demonstrate that our tool is flexibly usable for searching exact (Type-1) and near-miss (Type-2 and Type-3) clones with high precision and recall.