Understanding the Open Source Software Development Process

The open source software "movement" has received enormous attention in the last 15 years. It is often characterized as a fundamentally new way to develop software systems. It is remarkable that large numbers of people, who may not know each other, manage to work together successfully to create high-quality, widely used products.

In this project, you will perform an empirical study of a number of real-world open source projects. You will collect software engineering data about these projects and investigate the process of open source software development, in particular:

Q1: How many people wrote code for new functionality? How many people reported problems? How many people repaired defects?
Q2: Did large numbers of people participate somewhat equally in these activities, or did a small number of people do most of the work?
Q3: Was strict code ownership enforced at the file or module level?
Q4: What is the defect density of the open source software you studied?
Q5: How long did it take to resolve bugs?

Open Source Projects:

In this course project, you will analyze any two of the following open source projects:

- Eclipse
- Mozilla Firefox
- Linux Kernel
- GNOME
- Apache HTTP Server
- Any software project developed by your company

Detailed descriptions of these projects are available on the Internet. For example, the Mozilla Firefox website is at: https://www.mozilla.org/

Data Source:

You need to collect software engineering data about these projects from the following sources:

- Project website: you can learn more about the project from its website.
- Bug tracking system: a bug tracking system such as Bugzilla allows individual developers or groups of developers to keep track of bugs in their products effectively. For example, the bug tracking system of Mozilla is: https://bugzilla.mozilla.org/. From the bug tracking system, you can collect bug-related data such as bug id, the package/component in which the bug was found, version, open date, close date, status, resolution, severity, priority, reporter, assignee, full summary, etc.
- Source code repository: this is the version control system that manages the source code and its changes. Examples include CVS, SVN, and Git. From the source code repository, you can collect software change data, which includes the number of added/deleted lines of code, the number of added/modified files, the committers, the commit times, the commit logs, etc.
- Developer forum/mailing list: you can browse the developer forum or join the developers' mailing list to read the archived discussions and messages.
- Any other sources, such as project documentation, product web pages, etc.

References:

- A. Mockus, R. T. Fielding, and J. Herbsleb. Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology, 11(3):309-346, July 2002.
- D. German and A. Mockus. Automating the measurement of open source projects. In Proceedings of the 3rd Workshop on Open Source Software Engineering, 2003.
- A. Schröter, T. Zimmermann, R. Premraj, and A. Zeller. "If your bug database could talk...". In Proceedings of the 5th International Symposium on Empirical Software Engineering, Volume II: Short Papers and Posters, 2006, pp. 18-20.
- E. S. Raymond. The Cathedral and the Bazaar, 1999.
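
Example analysis sketch:

Once the bug and commit data have been collected, questions such as Q2, Q4, and Q5 reduce to small computations over those records. The Python sketch below is one possible way to approach them, not a prescribed method. The function names, the record fields ("opened", "closed"), and the sample data are all made up for illustration, and defect density is assumed here to mean defects per thousand lines of code (KLOC), which is a common convention that the handout itself does not fix.

```python
from collections import Counter
from datetime import date
from math import ceil
from statistics import median


def top_contributor_share(authors, top_fraction=0.1):
    """Share of all items (commits, bug reports, fixes) produced by the
    most active `top_fraction` of contributors (relates to Q2)."""
    counts = sorted(Counter(authors).values(), reverse=True)
    if not counts:
        return 0.0
    # Always consider at least one contributor.
    top_n = max(1, ceil(len(counts) * top_fraction))
    return sum(counts[:top_n]) / sum(counts)


def resolution_days(bugs):
    """Days from open date to close date for every closed bug (relates to Q5).
    Bugs without a close date are skipped."""
    return [(b["closed"] - b["opened"]).days for b in bugs if b.get("closed")]


def defect_density(defect_count, lines_of_code):
    """Defects per KLOC (assumed definition, relates to Q4)."""
    return defect_count / (lines_of_code / 1000)


# Hypothetical sample data standing in for records exported from a
# version control system and a bug tracker.
commit_authors = ["alice"] * 6 + ["bob"] * 2 + ["carol", "dave"]
bugs = [
    {"id": 1, "opened": date(2024, 1, 1), "closed": date(2024, 1, 11)},
    {"id": 2, "opened": date(2024, 1, 5), "closed": date(2024, 1, 8)},
    {"id": 3, "opened": date(2024, 2, 1), "closed": None},  # still open
]

share = top_contributor_share(commit_authors, top_fraction=0.25)
days = resolution_days(bugs)
print(f"Top 25% of committers made {share:.0%} of commits")
print(f"Median resolution time: {median(days)} days")
print(f"Defect density: {defect_density(3, 12000):.2f} defects/KLOC")
```

The median is used for resolution time because a few very old bugs can dominate a mean. Whether to count only fixed bugs, or all closed bugs regardless of resolution, is a choice you should state explicitly in your report.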