The Pedigree Import tool will be developed as a web-based application using PHP. Its functionality has been specified and discussed with users. The initial design consists of two main forms, one for searching existing germplasm records and another for entering new germplasm information. The application will provide:

  • customization settings that let the user modify the behavior of the searching and parsing algorithms.
  • an option to treat commonly interchanged characters as equivalent, so that the search for germplasm names tolerates a certain degree of error.
  • a list of pre-defined nomenclature rules that the user can select and apply.
  • suggestions for simple substring replacements in pedigree names, for correcting historical pedigree entries.
  • options for managing multiple germplasm names and attributes within a single germplasm entry or across several related or unrelated germplasm entries.
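The tolerant name search described in the list above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the tool's PHP code. The table of interchanged characters, the ignored separators, and the function names `search_key` and `find_matches` are all assumptions made for illustration; in the tool, these would come from the user's customization settings.

```python
# Assumed table of commonly interchanged characters: each key is folded
# to its value before names are compared (hypothetical, user-configurable).
INTERCHANGED = {"0": "O", "1": "I"}

# Separators assumed to be insignificant when comparing names.
IGNORED = " -"


def search_key(name, tolerant=True):
    """Reduce a germplasm name to the key used for comparison."""
    key = name.upper()
    if tolerant:
        key = "".join(INTERCHANGED.get(ch, ch) for ch in key if ch not in IGNORED)
    return key


def find_matches(query, names, tolerant=True):
    """Return the stored names whose key equals the key of the query."""
    wanted = search_key(query, tolerant)
    return [n for n in names if search_key(n, tolerant) == wanted]


stored = ["IR 64", "IR64", "1R-64", "IR 46"]
tolerant_hits = find_matches("IR64", stored)
strict_hits = find_matches("IR64", stored, tolerant=False)
```

With the tolerant option on, the query matches the three spellings of the same name but not the transposed "IR 46"; with it off, only the exact spelling matches.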

The main form for searching existing germplasm lets the user verify whether the germplasm to be imported already exists in the database. It shows a variety of information related to the matching germplasm entries: the list of matching names, the pedigree tree of the currently selected match, the germplasm origin information, the other names and attributes of that germplasm, the relatives and neighborhoods to which that germplasm belongs, the germplasm lists and studies in which that germplasm was used, and available images of seeds and plants showing the phenotypic characteristics of the currently selected match.

The main form for inputting a new germplasm entry provides two complementary tools for parsing the pedigree string:

  • a fully automated parsing function that, once executed, parses the pedigree string according to the nomenclature rules selected by the user, as mentioned above. The result of the parsing algorithm is shown as a pedigree tree whose nodes correspond to germplasm entries. Fully automated parsing of a germplasm entry continues until it encounters one of four scenarios: (1) the name of the germplasm node already exists in the database; (2) the germplasm node is listed as one of the breakpoint nodes; (3) there is an irregularity in the germplasm name, so the parsing cannot continue; (4) the parsing algorithm can no longer identify any parsing symbol with which to break the germplasm name down further.
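The four stopping scenarios can be made concrete with a short recursive sketch, written in Python for illustration rather than in the tool's PHP. It assumes a deliberately simplified nomenclature that is not taken from the specification: crosses are written "female/male", parentheses group nested crosses, and an unpaired parenthesis or an empty name is the only irregularity detected. The names `Node`, `parse`, `existing`, and `breakpoints` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    status: str = ""                      # why parsing stopped at this node
    parents: list = field(default_factory=list)


def balanced(text):
    """True if every grouping symbol in the text has a partner."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def strip_outer(text):
    """Remove one pair of parentheses only if it encloses the whole text."""
    if text.startswith("(") and text.endswith(")") and balanced(text[1:-1]):
        return text[1:-1]
    return text


def top_level_split(text):
    """Split at the first crossing symbol that is outside all parentheses."""
    depth = 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "/" and depth == 0:
            return text[:i], text[i + 1:]
    return None


def parse(name, existing, breakpoints):
    """Build a pedigree tree, stopping at each of the four scenarios."""
    name = strip_outer(name.strip())
    node = Node(name)
    if name in existing:                      # scenario 1
        node.status = "exists in database"
    elif name in breakpoints:                 # scenario 2
        node.status = "breakpoint"
    elif not name or not balanced(name):      # scenario 3
        node.status = "irregular name"
    else:
        split = top_level_split(name)
        if split is None:                     # scenario 4
            node.status = "no parsing symbol"
        else:
            node.parents = [parse(p, existing, breakpoints) for p in split]
    return node


tree = parse("(IR8/PETA)/TN1", existing={"IR8"}, breakpoints={"TN1"})
```

In this example the root is split into the cross "IR8/PETA" and the node "TN1". Parsing stops at "IR8" because it already exists, at "PETA" because no parsing symbol remains, and at "TN1" because it is a breakpoint. Removing "TN1" from `breakpoints` and parsing again would let the algorithm continue past it.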

The accompanying tool is the semi-automated parsing interface. It allows the user to make the necessary corrections when there is an irregularity in the germplasm name, for example an unpaired grouping symbol or an unsupported crossing symbol. A set of controls allows the user to edit portions of the pedigree string, specify the split in crosses or the generation in derivatives, check the resulting split for correctness, and apply the changes back to the pedigree tree.
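Two of the operations behind such an interface can be sketched as follows, again in Python for illustration only. The sketch assumes that parentheses are the grouping symbols and "/" is the crossing symbol; the function names `unpaired_grouping` and `manual_split` are hypothetical. The first function locates unpaired grouping symbols so the interface could highlight them for correction, and the second applies a split at a position chosen by the user.

```python
def unpaired_grouping(pedigree):
    """Return the positions of grouping symbols that have no partner."""
    open_positions = []   # positions of "(" still waiting for a ")"
    unpaired = []         # positions of ")" with no preceding "("
    for i, ch in enumerate(pedigree):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if open_positions:
                open_positions.pop()
            else:
                unpaired.append(i)
    return sorted(unpaired + open_positions)


def manual_split(pedigree, position):
    """Split a cross at the crossing symbol the user pointed to."""
    if pedigree[position] != "/":
        raise ValueError("position does not point at a crossing symbol")
    return pedigree[:position].strip(), pedigree[position + 1:].strip()


problems = unpaired_grouping("(IR8/PETA/TN1")
female, male = manual_split("(IR8/PETA)/TN1", 10)
```

Here `problems` identifies the opening parenthesis at position 0 as unpaired. After the user corrects the string, `manual_split` separates it into the two parents at the chosen crossing symbol, and the result can be checked before it is applied to the pedigree tree.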

A tabbed section of the form allows the user to define the origin information for each germplasm node, as well as other names and attributes for each germplasm. One page in this tabbed section holds the list of breakpoint nodes. The user sets a breakpoint manually by selecting the node in the pedigree tree where the automated parsing should stop, and then pressing the breakpoint button. To continue parsing past a breakpoint node, the user only needs to remove the node from the list.

Pedigree Import Workflow

