Design Document - October.23.2009
Introduction
We want to parse (read) a file, a web page, or ideally any other input format, and output the data in a standard form so that the viewer can display it in many different ways (graph, chart, etc.).
Overview
When the user runs DataViz, the GUI displays a Frame. This Frame allows the user to select a data file to visualize. This data file is passed to a Parser which parses it and returns it to the Frame in the form of a DataSet, a data structure designed to house tables of data. This DataSet is passed to the visualization. When the visualization requires information about the DataSet in order to perform the operations requested by the user, it passes the DataSet to the DataHandler. The DataHandler applies functions and filters that allow the visualization to display the data in the desired format. The visualization then passes the graphical view of the data to the Frame, which displays the final product to the user. As the user continues to interact with the Frame, the visualization may request further information from the DataHandler in order to update the display.
List of Classes/Files
- Main: Launches DataViz program
- Parser: Responsible for reading in the file given by the GUI and passing back a DataSet format that the program can work with
- Parser
- EDParser
- DelimiterParser
- DataSet
- Variable
- Data
- DataHandler: Holds all the filters and functions needed to act on the data when visualizing
- Viewer: The GUI view where the user will interact with DataViz and pick from visualizations that extend canvas
- Canvas
- Table
- Plot
- BarGraph
- Frame
- FileMenu
- Graphical Settings
- Graphics Chooser Frame
- Utility: Utility classes provided by Duvall and in some cases modified for use in DataViz.
- Reflection
- Resource
- Command
- DataVizException
- FileCommand
- Reader
- Resources: Resource files for the GUI and for adding in extensions using reflection
- Displays.properties
- English.properties
- ExtensionParserMap.properties
- Filters.properties
- Test: Test Package to test the Parser and DataSet
- DataVizTest
- Test.resources/parsers.txt
- Data
- Templates
- API
- README
User Interface Design
We intend to keep the user interface of this program as simple as possible. The file menu has options to open and read in a text file of data, to switch between different ways to display the data, and to quit the program. The Frame has a Canvas that fills all the remaining space. The Canvas class extends JPanel and is abstract, so each display extends Canvas. Any other user input is directly related to the type of display chosen, so it is handled by these implementing classes. Table, Plot, and BarGraph are the three implementing classes, each of which shows the data in a unique way. Each has a PlotArea, which extends JComponent; the paintComponent() method of this class defines how the data is drawn. Each also has a separate internal class which creates a JPanel at the bottom of the Frame for user input. Table has a panel which is used to sort the data, Plot allows the user to choose which Variable is on each axis, and BarGraph does the same as Plot and also allows the user to take the min, max, average, and standard deviation of the y-axis variable. Specifically, JComboBoxes allow the user to select the variables to be plotted on each axis. An advantage of this approach is that the user never enters data directly; the only erroneous input would come from a bad text file, which is handled immediately.
To add a visualization, create a new class that extends Canvas in the way described above. No pre-existing code needs to be modified; only resource files change. An entry must be added to both the English and Displays files. The requirements are that the same key is used in both files (the standard is __Command, e.g. PlotCommand), that the entry in Displays is the name of the new class (for reflection), and that the entry in English is the string that will appear in the file menu. No modification to the Frame is necessary; all entries in the Displays file are automatically added to the FileMenu.
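The registration scheme above can be sketched roughly as follows. This is an illustration under assumptions, not the DataViz code: DisplaySketch, PlotDisplaySketch, and DisplayRegistrySketch are hypothetical names, and the two resource files are assumed to use the plain key=value format read by java.util.Properties.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical stand-ins for Canvas and one display; not the DataViz classes.
abstract class DisplaySketch {
    abstract String describe();
}

class PlotDisplaySketch extends DisplaySketch {
    @Override
    String describe() { return "plot"; }
}

final class DisplayRegistrySketch {
    private DisplayRegistrySketch() {}

    /** Loads key=value text in java.util.Properties format. */
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Could not read properties", e);
        }
        return props;
    }

    /** Pairs entries by shared key: menu label (English) -> class name (Displays). */
    static Map<String, String> menuEntries(Properties displays, Properties english) {
        Map<String, String> entries = new LinkedHashMap<>();
        // Sorted keys give a stable menu order; the real ordering is not specified.
        for (String key : new TreeSet<>(displays.stringPropertyNames())) {
            String label = english.getProperty(key);
            if (label != null) {
                entries.put(label, displays.getProperty(key));
            }
        }
        return entries;
    }

    /** Instantiates a display by class name through its no-argument constructor. */
    static DisplaySketch create(String className) {
        try {
            return (DisplaySketch) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create display " + className, e);
        }
    }
}
```

Because the menu is built only from the two files, a new display in this sketch needs one new class plus one line in each file; keys that appear in English but not in Displays (such as a quit command) are simply not treated as displays.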
Design Details
Parser/Data
EDParser: Parser that works with files with a specific format, marked by the extension ".ed". The first line in these files is the title followed by a "#VARIABLES" section that reads off the column headings. A below the variable column indicates a key reference for that column. This is followed by a "#DATA" section that contains the actual data.
DelimiterParser: Parser that works with data files that use delimiters to separate data. Use "," for .csv files and " " for .prn files as the delimiter. The first row is the heading row that contains all the variables this data keeps track of in the columns.
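A minimal sketch of the delimiter-based reading described above, assuming the file has already been read into a list of lines. DelimitedSketch and its methods are hypothetical names; the sketch does not handle quoted fields, and a run of several spaces in a .prn file would produce empty fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class DelimitedSketch {
    private DelimitedSketch() {}

    /** Splits one line on a literal delimiter and trims each field. */
    static List<String> splitLine(String line, String delimiter) {
        // Pattern.quote treats the delimiter as plain text, not as a regex;
        // the -1 limit keeps trailing empty fields.
        String[] parts = line.split(Pattern.quote(delimiter), -1);
        List<String> fields = new ArrayList<>();
        for (String part : parts) {
            fields.add(part.trim());
        }
        return fields;
    }

    /** Row 0 of the result is the heading row; later rows are data. Blank lines are skipped. */
    static List<List<String>> parse(List<String> lines, String delimiter) {
        List<List<String>> table = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue;
            }
            table.add(splitLine(line, delimiter));
        }
        return table;
    }
}
```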
Data: Contains an ArrayList which contains the values for one row of a data table. (Usually, the information for one individual).
Variable: Contains a map from integer representations of data to their String meanings. Contains an overarching name that describes what this information represents. These are the labels for the individual columns of the data table.
DataSet: Contains an ArrayList of Variables and an ArrayList of Data. Elements of the Data list each represent a row of data (usually an individual). The index of a Variable in myVariables determines which index of each Data element it is linked to.
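The index relationship between Variables and Data can be sketched as below. The class and field names are hypothetical, and storing values as Double is an assumption (the progress log mentions moving from ints to doubles); the real classes may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A column label plus its integer-to-String meanings.
class VariableSketch {
    final String name;
    final Map<Integer, String> meanings = new HashMap<>();

    VariableSketch(String name) {
        this.name = name;
    }
}

// One row of the table (usually one individual).
class DataSketch {
    final List<Double> values = new ArrayList<>();
}

class DataSetSketch {
    final List<VariableSketch> variables = new ArrayList<>();
    final List<DataSketch> rows = new ArrayList<>();

    /** Returns one column: the value stored at the variable's index in every row. */
    List<Double> column(VariableSketch variable) {
        int index = variables.indexOf(variable);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown variable: " + variable.name);
        }
        List<Double> result = new ArrayList<>();
        for (DataSketch row : rows) {
            result.add(row.values.get(index));
        }
        return result;
    }
}
```

Keeping the link implicit in the list index means a Variable needs no reference to the rows, at the cost that both lists must stay in the same order.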
DataHandler: Contains functions and filters to modify data and get information from data. All functions work on or use Variables. These methods are static so that a DataHandler does not have to be created in order to use them. These filters/functions can then be utilized in the visualizations.
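As a rough illustration of static functions of this kind, the sketch below computes min, max, average, and standard deviation over one column of values. StatsSketch is a hypothetical name, the input is assumed to be a non-empty list of doubles for a single Variable, and the standard deviation shown is the population form (dividing by n), which may not match the choice made in DataViz.

```java
import java.util.List;

final class StatsSketch {
    private StatsSketch() {}

    private static void requireNonEmpty(List<Double> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("No values to summarize");
        }
    }

    static double min(List<Double> values) {
        requireNonEmpty(values);
        double result = values.get(0);
        for (double v : values) {
            result = Math.min(result, v);
        }
        return result;
    }

    static double max(List<Double> values) {
        requireNonEmpty(values);
        double result = values.get(0);
        for (double v : values) {
            result = Math.max(result, v);
        }
        return result;
    }

    static double average(List<Double> values) {
        requireNonEmpty(values);
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    /** Population standard deviation: square root of the mean squared deviation. */
    static double standardDeviation(List<Double> values) {
        double mean = average(values);
        double sumOfSquares = 0;
        for (double v : values) {
            sumOfSquares += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumOfSquares / values.size());
    }
}
```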
Reader: Extends FileCommand. Reads in a data file and creates a parser based on its Extension, using Reflection. The resulting DataSet is passed to the current display.
Parser: An abstract class extendable for different types of files. All Parsers must be able to read Data and Variables from their respective data files. Creating a Parser creates a DataSet with these Data and Variables.
GUI
Frame: This is created from the Main class and creates the main frame with a Menu and a Canvas.
FileMenu: This class extends JMenu. It has a method to add a FileCommand, and it calls the command's execute method from actionPerformed.
Canvas: Abstract class extending JPanel which is added to the main Frame as the primary display of output. It contains a field for the currently loaded DataSet which is used by Children of the class, and it has an embedded abstract class PlotArea which extends JComponent and must be implemented in subclasses.
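The Canvas and PlotArea arrangement might look roughly like the sketch below. All names here are hypothetical, a list of doubles stands in for the DataSet, and the install() helper is an assumption added so that the abstract base constructor does not call into subclass code.

```java
import java.awt.BorderLayout;
import java.awt.Graphics;
import java.util.ArrayList;
import java.util.List;
import javax.swing.JComponent;
import javax.swing.JPanel;

abstract class CanvasSketch extends JPanel {
    // Stand-in for the currently loaded DataSet.
    private List<Double> data = new ArrayList<>();

    protected CanvasSketch() {
        super(new BorderLayout());
    }

    /** Subclasses call this once from their own constructor to add their drawing area. */
    protected final void install(PlotArea area) {
        add(area, BorderLayout.CENTER);
    }

    void setData(List<Double> newData) {
        data = new ArrayList<>(newData);
        repaint();
    }

    protected List<Double> getData() {
        return data;
    }

    /** Embedded drawing surface; each display supplies its own paintComponent. */
    abstract class PlotArea extends JComponent {
        @Override
        protected abstract void paintComponent(Graphics g);
    }
}

class BarCanvasSketch extends CanvasSketch {
    BarCanvasSketch() {
        install(new Bars());
    }

    private class Bars extends PlotArea {
        @Override
        protected void paintComponent(Graphics g) {
            // Draws one fixed-width bar per value, growing down from the top edge.
            int x = 0;
            for (double value : getData()) {
                g.fillRect(x, 0, 10, (int) value);
                x += 12;
            }
        }
    }
}
```

Changing the display then means swapping one CanvasSketch subclass for another while the data field carries over, which matches the goal of not reloading the file when the view changes.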
Table, Plot and BarGraph: Classes which extend Canvas and show data in the format specified by the name of the class. Each may include ways for the user to interact with the data and alter the way it is displayed.
Reflection and Resources: These are utility packages which are included for reflection and for global resources, respectively.
Design Considerations
Parser Construction
We wanted to use reflection to select the parser, but that meant all the parsers had to have the same constructor. Eventually we chose to change EDParser so that a delimiter is passed in but never used.
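The constraint can be seen in a small sketch: when the parser class is chosen by name at run time, the calling code can only look up one constructor signature, so every parser has to provide it. The names below are hypothetical, and the map from extension to class name and delimiter is an assumed stand-in for whatever ExtensionParserMap.properties actually holds.

```java
import java.lang.reflect.Constructor;
import java.util.Locale;
import java.util.Map;

abstract class ParserSketch {
    protected final String delimiter;

    protected ParserSketch(String delimiter) {
        this.delimiter = delimiter;
    }

    abstract String describe();
}

class DelimSketch extends ParserSketch {
    DelimSketch(String delimiter) {
        super(delimiter);
    }

    @Override
    String describe() {
        return "delimited by '" + delimiter + "'";
    }
}

class EdSketch extends ParserSketch {
    // The delimiter is accepted only so the constructor matches; it is never used.
    EdSketch(String ignoredDelimiter) {
        super(ignoredDelimiter);
    }

    @Override
    String describe() {
        return "ed format";
    }
}

final class ParserFactorySketch {
    private ParserFactorySketch() {}

    /** Map values are {class name, delimiter}; keys are lower-case file extensions. */
    static ParserSketch forFile(String fileName, Map<String, String[]> byExtension) {
        int dot = fileName.lastIndexOf('.');
        String extension = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        String[] entry = byExtension.get(extension);
        if (entry == null) {
            throw new IllegalArgumentException("No parser for extension: " + extension);
        }
        try {
            // Every parser must expose this exact (String) constructor.
            Constructor<?> constructor =
                    Class.forName(entry[0]).getDeclaredConstructor(String.class);
            return (ParserSketch) constructor.newInstance(entry[1]);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create parser " + entry[0], e);
        }
    }
}
```

An alternative would be a no-argument constructor plus a separate setter for the delimiter, but that moves the unused parameter into an unused method rather than removing it.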
Plotting Strings
We struggled for a long time trying to figure out how to differentiate between data that represented numbers vs. strings. Since the data was mixed, it would crash our plot and other code that depended on numbers. Eventually we converted all values to numbers, using the field map of each Variable to translate between the numbers and their String meanings.
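One way such a conversion could work is sketched below; it is an assumption about the approach, not the method DataViz uses. A column whose values all parse as doubles is kept as numbers; otherwise every distinct string in the column is given an integer code in first-seen order, and the code-to-string pairs are recorded in a map like the one held by a Variable. FieldEncodingSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FieldEncodingSketch {
    private FieldEncodingSketch() {}

    static boolean isNumeric(String text) {
        try {
            Double.parseDouble(text.trim());
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /**
     * Converts one column of raw text to doubles. If any value is non-numeric,
     * the whole column is coded and the codes are recorded in meanings.
     */
    static List<Double> encodeColumn(List<String> raw, Map<Integer, String> meanings) {
        boolean allNumeric = true;
        for (String text : raw) {
            if (!isNumeric(text)) {
                allNumeric = false;
                break;
            }
        }
        List<Double> encoded = new ArrayList<>();
        Map<String, Integer> codes = new HashMap<>();
        for (String text : raw) {
            String value = text.trim();
            if (allNumeric) {
                encoded.add(Double.parseDouble(value));
            } else {
                Integer code = codes.get(value);
                if (code == null) {
                    code = codes.size();
                    codes.put(value, code);
                    meanings.put(code, value);
                }
                encoded.add(code.doubleValue());
            }
        }
        return encoded;
    }
}
```

Coding the whole column when any value is non-numeric avoids a collision between a real number such as 0 and the code 0. Note that Double.parseDouble also accepts text such as "NaN" and "Infinity", so a stricter check may be wanted for real data.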
DataSet vs. Data Handler
We initially wanted functions like min, max, and avg to be in DataHandler, but after writing them we realized that it was basically like putting them in the DataSet class. Eventually we were able to refactor some of them out so that they are truly separate from the data, but it was a messy call to figure out which functions were necessary for the DataSet to return useful information and which were "filters" to help plot.
Remaining Issues
- Do not have units on the axes
- Drawing of labels on axes may overlap and be unreadable
Extensions
Extensions that we implemented:
- Allow users to customize a visualization via a color palette that swaps the colors displayed, with savable and loadable templates.
- Allow users to interact with the visualized data by changing the values displayed in the axes and applying different functions on the y-axis of bar graphs.
- When a user hovers over the exact point in plot, the coordinates show up in a pop up box.
Extensions that we were not able to implement but wanted to:
- Read from a web site (requires reading in html and creating a new subclass of parser to handle the html code)
- Support different output formats - save as image or html (creating different writer subclasses of command)
- Allow users to customize a visualization via imported templates - A background to display behind the data
- Allow users to interact with the visualized data - add animation over time (requires updating plot with a variable that changes and a delay)
Team Responsibilities
Team "Unicorn" consists of Jimmy Sedlick, Megan Heysham, and Elizabeth Liang.
Jimmy will be primarily in charge of displaying data which involves creating the GUI. His secondary responsibility is to manage the repository and help write up the artifacts and design documents.
Megan will be primarily working on organizing the data once it is read in. Her secondary responsibility is testing, which Liz will be helping her with.
Liz will be primarily working on the reading in of data for the project. Her secondary responsibilities include managing the website, documentation, taking notes during meetings, helping Megan with testing, and emailing the TA.
Task List & Estimated Time Usage
- Reading in Data
- Reading multiple file types & making Reader choose correct parser
- Constructor of an abstract class shouldn't call abstract methods
- Using reflection in Reader to make Reader closed
- Organizing Data
- Handle data types better - able to plot strings
- Functions like average and min/max, maybe std. Deviation
- Doubles instead of ints
- Range Function
- Sort the data by column
- Function that sorts different variable fields into a map
- Generalized functions package where you only input the variable and the data it should act on
- Finals should pull from resource file
- Displaying Data
- Restrict views depending on data type (i.e., if a file has string data, don't allow it to Plot)
- Add different views: BarGraph, etc.
- Popup message on Exceptions
- Displaying field strings
- Being able to plot the min, max, avg, maybe percents as well so that you can compare scale and have two separate plots on the same graph
- Make DataVizException pull from ResourceManager
- Testing and Documentation
- API
- Project Artifact
- Comments/Java Doc/@author tags
- Deleting unused classes and code
- Testing
- README
- Design Document
- Meetings
Total Estimated Time: 50 hours
Progress Log & Timeline
Wednesday: Met in class, introduced ourselves, set a meeting time for Thursday and talked about things we need to do for the project. 0.5 hours
Thursday: Met to talk about roles and timeline and website for 1 hour. Liz worked on building the webpage template for 1.5 hours. 2.5 hours
Friday: Liz finished putting the content up on the website. 1 hour
Project Webpage - Friday.October.9.2009
Saturday: Liz found an Irish data set and read through the nameSurfer program. 1 hour
Sunday: Liz created the parser class and associated components like the DataSet, Variable, Data classes. Formatted Irish.ed data to be read in. Added Main class, which creates a Frame. Frame adds a FileMenu which uses Command hierarchy for actions (just Reader and Writer for now, only Reader is implemented) Frame also adds a Canvas. Canvas is abstract and is extended by whatever display we want (Plot and BarGraph for now, neither is implemented yet). Added a Table class which extends Canvas and shows data in tabular form. 6 hours
Monday: Jimmy created a plot class and Liz created a parser that can read in csv files. Still some bugs to be worked out but at least the data is read in. Megan added some more functionality to the DataSet class. Changed Canvas to extend JPanel instead of JComponent. Now each class that extends Canvas will have an embedded JComponent class as the display. Added Plot which can plot any choice of variables on either axis. Added new Menu on the Frame which allows user to switch between Table and Plot, using reflection. Added reflection utilities. 7 hours
Tuesday: Liz commented for the parser package and created a parser hierarchy. Jimmy commented for the view package and worked more with the reader class. Everyone worked to type up the artifact. Data file now read into Reader, then passed to Canvas. Changing type of display no longer requires reloading the data. Plot updated with numbers at the end of the axes. ResourceManager added so resources can be read in from all files. 4 hours
Data API - Tuesday.October.13.2009
Wednesday: Megan worked on figuring things out and making the table display strings correctly. Began to write functions and eventually moved them from data handler to the DataSet. 2.5 hours
Thursday: Megan continued to work on functions and changed the integers to doubles throughout the program. 2.5 hours
Friday: Jimmy created FontChooserPanel, and plot is scrollable now. 1.5 hours
Saturday: Jimmy and Megan created an API file in the project.
Revised Data API - Saturday.October.17.2009
Sunday: Jimmy added templates. 1.5 hours
Monday: Megan worked on fixing some of the functions and the team revised the API in class. Jimmy worked on code so that hovering over a point in plot now opens a frame which shows the coordinate. 3.5 hours
Tuesday: Megan wrote getNumericalVariable that returns the variable if it contains only numerical data and returns null otherwise. Tried to get Plot to only show the non-null ones, but then 1900 doesn't work and claims to have non-numerical data. Liz worked on writing separate by variable function. 2 hours
Wednesday: Megan worked on exception handling and displaying and Liz worked on moving variable functions to the Data Handler class and modifying the code correctly. Jimmy implemented bar graph. 4 hours
Thursday: Liz commented the parser and data handler classes and worked on the API. Jimmy wrote code so that you can now sort data in Table. Frame also updated to use reflection to select Canvas. 4 hours
Friday: Megan worked on testing and helping type up the documentation (API/Design Document/Commenting/etc.) and Liz wrote a method to turn non-double data into numbers by utilizing the field map for variables. This allowed all the data to be plotted. She also debugged the program to get labels on the plot to work, helped with testing and commenting, wrote the README, and worked on the API and Design Document. Jimmy updated GraphicsChooser and wrote BarGraph so it uses reflection to select the filter. 15 hours