Table Viewer Specifications
The Table Viewer is the simplest Decision Support tool in the CWS. It provides an interactive view of datasets which enables breeders to select the best performing germplasm. The principal data type addressed by the Table Viewer is phenotypic data, but it could also include genotype results and germplasm information to allow integration of these three data types at the simplest level – simultaneous viewing. There are two basic table types of interest: a table of germplasm entries by different trait values, and a table of entries by single trait values over different environments. The entries are generally represented as rows and the traits or environments as columns, and there is probably no need for a transpose function. The table of data as defined above is usually derived from an analysis process. In other words, the values are generally means of some kind over replications and possibly also over environments. There is also generally a header or footer section containing summary statistics (grand means, standard errors, genotypic variance, heritability, etc.).
In spite of these specifics, the Table Viewer is essentially just a data set view, mostly the observation sheet, with formatting functionality and so the specifications will be described in this context.
1. Allow the user to select a dataset (representation) for viewing by browsing the study tree or using a search function on study names.
2. Open a description page showing the factors, labels and variates in the chosen dataset, similar to the Description sheet of the workbook. Display these in rows with summary statistics:
a) For each factor and label, the name, description, property, scale, method, data type and the number of distinct levels. Those with a single level are conditions; for these, add the value.
b) For each variate, the name, description, property, scale, method, data type and a data summary – number of observations, min, max, mean for numeric variates, number of distinct classes for discrete variates.
3. Look for any effects in the study which are indexed by subsets of the set of factors defining the chosen dataset and which also contain variates with the same property as any of those in the selected set (they may not have the same name, scale or method).
These are the summary statistics which may go in the header or footer of the table. For example, the selected dataset may be indexed by location and entry, and contain a variate with property GRAIN YIELD. There may be an effect in the study which is indexed only by location but which also contains two variates with property GRAIN YIELD, one called Mean Grain Yield and another called SE Grain Yield. These variates would contain the location means and SEs for the site means in the main dataset. There might also be GRAIN YIELD variates in the STUDY effect (i.e. indexed only by Study), containing the grand mean and SE, and there may be GRAIN YIELD variates in an effect indexed only by Entry, which would contain the means across locations.
Present all these variates in the same way as those in 2b, but in sections headed by the effect names and/or defining factors.
4. Allow the user to select items to display in the viewer by checking or unchecking factor, label and variate names in the main section of the description page. Selecting or deselecting variates in the main section should have the same effect on the corresponding variates in the summary statistics sections, although these should also be deselectable when the equivalent variate in the main section is selected. Indicate for each summary variate whether it should appear in the header or footer of the observation page.
5. If more than one factor is selected for display, offer the opportunity to 'parallelize' one of the factors. This means that the variates will be displayed in parallel blocks, one block for each level of the parallel factor. If only one variate is selected, there will only be one column in each block – for example GRAIN YIELD means for each location (where location is the parallel factor). Header and footer variates would be concomitantly parallelized provided their effects were indexed by the parallel factor – eg location means and SEs for each entry. Similarly, summary variates not indexed by the parallel factor, but indexed by the other factors of the main dataset, eg entry means over locations, would be presented as marginal columns, perhaps with their own header and footer summary statistics – eg grand means and SEs.
6. Also allow users to filter data on factor levels or data values to define a selected dataset to view. When a filter is active adjust the display described in 2 to show information about the selected data.
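The filtering, summary and parallelization steps in items 2b, 5 and 6 can be sketched as follows. This is only an illustration under stated assumptions: the sample rows, the level names ("Loc A" and so on) and the function names filter_rows, summarize_numeric and parallelize are hypothetical and not part of the CWS. It shows a serial table indexed by location and entry being filtered, summarised for the description page, and then rearranged into one block per level of the parallel factor.

```python
from collections import defaultdict

# Hypothetical serial dataset: one row per (LOCATION, ENTRY) combination.
rows = [
    {"LOCATION": "Loc A", "ENTRY": 1, "GRAIN YIELD": 5.2},
    {"LOCATION": "Loc A", "ENTRY": 2, "GRAIN YIELD": 4.8},
    {"LOCATION": "Loc B", "ENTRY": 1, "GRAIN YIELD": 6.1},
    {"LOCATION": "Loc B", "ENTRY": 2, "GRAIN YIELD": 5.9},
    {"LOCATION": "Loc C", "ENTRY": 1, "GRAIN YIELD": 3.9},
    {"LOCATION": "Loc C", "ENTRY": 2, "GRAIN YIELD": 4.4},
]


def filter_rows(rows, factor, allowed_levels):
    """Keep only rows whose level of `factor` is in `allowed_levels` (item 6)."""
    allowed = set(allowed_levels)
    return [r for r in rows if r[factor] in allowed]


def summarize_numeric(rows, variate):
    """Data summary for a numeric variate (item 2b): n, min, max, mean."""
    values = [r[variate] for r in rows if r.get(variate) is not None]
    if not values:
        return {"n": 0, "min": None, "max": None, "mean": None}
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": sum(values) / len(values),
    }


def parallelize(rows, row_factors, parallel_factor, variates):
    """Rearrange a serial table into parallel blocks (item 5).

    Returns (levels, table): `levels` lists the levels of the parallel
    factor in order of first appearance, and `table` maps each combination
    of the remaining factors to {(level, variate): value}.
    """
    levels = []
    table = defaultdict(dict)
    for r in rows:
        level = r[parallel_factor]
        if level not in levels:
            levels.append(level)
        key = tuple(r[f] for f in row_factors)
        for v in variates:
            table[key][(level, v)] = r[v]
    return levels, dict(table)


# Filter first, so that both the summary and the grid reflect the selection.
selected = filter_rows(rows, "LOCATION", ["Loc A", "Loc B"])
summary = summarize_numeric(selected, "GRAIN YIELD")
levels, table = parallelize(selected, ["ENTRY"], "LOCATION", ["GRAIN YIELD"])

print("summary:", summary)
print("ENTRY", *levels, sep="\t")
for key, cells in sorted(table.items()):
    values = [cells.get((lvl, "GRAIN YIELD"), "") for lvl in levels]
    print(*key, *values, sep="\t")
```

With a single selected variate, each block has one column, which matches the GRAIN YIELD by location example in item 5. Missing level combinations are shown as empty cells rather than dropped, which is one possible design choice.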
1. Load the selected dataset (factors, labels and variates) into a grid view. Order factors and labels first, followed by variates. The first column should be a Row Number column with clickable cells: clicking selects or deselects the row, and right-clicking shows a menu of row-specific actions.
a) Columns should be named with the factor, label and variate names used in the study. The variable description, property, scale and method should be visible in a mouse over hint. Column headers should be clickable to select columns and right-clickable to open menus.
b) If there is a parallel factor this should appear as a row below the column headers with levels below each block of display variates.
c) Header variates (summary statistics), if any have been specified, should follow next as rows, followed by a separator row. The separator cells should be clickable to select columns and right-clickable to open menus.
d) Data rows follow for each level combination of non-parallel factors (as a serial table).
e) Footer variates (if there are any) should follow after a separator row with the same properties as in c.
f) A right-click on any cell should give a menu of cell specific action items such as select cell, deselect cell, highlight (color) cell.
2. There should be a menu bar at the top of the observation page with menus for Selected Cells, Selected Rows and Selected Columns. These should have action items applicable to the set of selected cells, rows or columns.
a) For Selected Cells the menu should allow users to highlight (color) selected cells (or remove highlights).
b) For Selected Columns the user should be able to hide selected columns or highlight (color) them or sort rows between the header and footer rows based on values in the selected columns. If several columns are selected then the sort rows item should allow the user to specify a sort order amongst the selected columns. If some selected columns are categorical, the user should be able to specify the sort order for the classes in those columns – numeric, alphabetical or custom. Custom sort order would involve specifying a numeric value for each class.
c) For Selected Rows, the user should be able to hide them, highlight them or move them to the header or footer section. For example you may select the rows corresponding to check entries and move these to the footer so that they are separated from the test entries and therefore not sorted or otherwise treated with those entries.
3. The action items behind a column header (or separator cell) should apply to the rows between the header and footer sections and depend on the type of data in the column.
a) They should all have the basic actions: select column, deselect column, hide column, highlight (or color) column, sort ascending and sort descending. (All columns are to be sorted together.)
b) If they are numeric columns the menus should have items like highlight (or color) cells with the 'n' largest or smallest values where 'n' could have a default value, say 3, or a default percentage, say 5%, but 'n' could also be set somehow.
c) If the header or footer rows of a numeric column contain means and SEs (determined from the method values) then a menu item should be to highlight outliers – values more than x SEs away from the mean where x could be 2.5 or 3.0 but could also be set somehow.
d) The menu for numeric columns should have an item to select the 'favorable direction' for the trait, e.g. largest is best or smallest is best. The highlight colors for the 'n' largest or smallest values should then be set for the 'n' most favorable or unfavorable values.
e) For columns with categorical variates the menu should allow picking most favorable and unfavorable categories and highlighting cells containing those classes, or specifying a sort order for the classes (numeric, alphabetical, or custom) and sorting on the column in ascending or descending order.
f) If the column contains a factor or label, then the items appropriate to categorical variates (e above) are relevant. In addition you might want to hide, highlight or move (to header or footer) all rows with certain levels of the factor.
4. Column operations to insert, move and fill columns should be available.
a) All column header menus should have an item to copy the current column or insert a (blank) column to the left or right of the current column. These new columns should be editable (original columns should not be, at least not by default). Blank columns should have extensive 'fill with' items in their menus and need to be named and described either through the 'fill with' process or directly. For example, fill with sequence or fill with index, where an index is a linear or logical combination of other columns in the table.
b) If a column has property Germplasm Identifier and scale DBID (ie is a GID) then the menu should have options to insert columns with many germplasm features – cross expansion, name of specified type, unique identifier, attribute of specified type, female parent GID or name, male parent GID or name, source GID or name. Also the user should be able to insert columns with trait values or genotype values from queries based on the GIDs.
c) If a column has property LOCATION and scale DBID then the menu should have items to insert new columns with many location features like name, abbreviation, country, latitude, longitude and location descriptors of specified types.
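The numeric column actions in items 3b to 3d above can be sketched as below. This is a minimal illustration, not a prescribed implementation: the function names highlight_outliers and n_most_favorable are hypothetical, the sample values are invented, and the mean and SE are assumed to have been read from the header or footer rows of the column. The defaults of 2.5 SEs and n = 3 follow the values suggested in the items.

```python
def highlight_outliers(values, mean, se, x=2.5):
    """Return the row indices of values more than `x` SEs from `mean` (item 3c).

    Missing values (None) are skipped.
    """
    return [
        i for i, v in enumerate(values)
        if v is not None and abs(v - mean) > x * se
    ]


def n_most_favorable(values, n=3, largest_is_best=True):
    """Return the row indices of the `n` most favorable values, best first.

    `largest_is_best` is the 'favorable direction' of item 3d; passing
    False selects the smallest values instead. Missing values are skipped.
    """
    present = [(v, i) for i, v in enumerate(values) if v is not None]
    present.sort(key=lambda pair: pair[0], reverse=largest_is_best)
    return [i for _, i in present[:n]]


# Hypothetical column of entry means, with one missing value.
grain_yield = [5.2, 4.8, 6.1, 9.5, 5.0, None, 4.4]

# Mean and SE are assumed to come from the header or footer rows.
outlier_rows = highlight_outliers(grain_yield, mean=5.0, se=0.5)
best_rows = n_most_favorable(grain_yield, n=3, largest_is_best=True)
lowest_rows = n_most_favorable(grain_yield, n=3, largest_is_best=False)

print("outliers:", outlier_rows)
print("best:", best_rows)
print("lowest:", lowest_rows)
```

Both functions return row indices rather than colors, so the viewer remains free to decide how the highlighting is rendered.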
The Table should allow many rows and columns with horizontal and vertical scroll bars. Vertical scroll bars should operate between the header and footer sections.
Fonts, column widths and row heights should be highly scalable.
The user should be able to save a table state (in a file?) and print a table.
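One possible way to save a table state is sketched below, assuming the state is written as JSON text that could then be stored in a file. The specification leaves the storage format open, so everything here is hypothetical: the TableState class, its field names and the dataset identifier are illustrative only. The fields correspond to the view settings described above (hidden columns, sort order, highlights, and rows moved to the footer).

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TableState:
    """Hypothetical record of the view settings for one dataset."""

    dataset_id: str
    hidden_columns: list = field(default_factory=list)
    # Sort keys in priority order, each as [column name, "asc" or "desc"].
    sort_columns: list = field(default_factory=list)
    # Cell highlights keyed by "row,column", with a color name as the value.
    highlights: dict = field(default_factory=dict)
    # Row numbers moved to the footer section (for example check entries).
    footer_rows: list = field(default_factory=list)

    def to_json(self):
        """Serialize the state to JSON text suitable for writing to a file."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text):
        """Rebuild a state from JSON text produced by to_json."""
        return cls(**json.loads(text))


state = TableState(
    dataset_id="example-dataset",
    hidden_columns=["PLOT"],
    sort_columns=[["GRAIN YIELD", "desc"]],
    highlights={"3,GRAIN YIELD": "yellow"},
    footer_rows=[11, 12],
)

text = state.to_json()
restored = TableState.from_json(text)
print(text)
print("round trip ok:", restored == state)
```

Lists and string keys are used throughout because JSON has no tuple type and requires string object keys, so the state survives a round trip unchanged.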