Data integration

 

Efforts in genomics since the late 1990s have led to major advances in understanding the biological processes underlying the variation of traits, including key traits of agronomical interest. However, a gap remains between these advances and their application to the development of new cultivars beneficial to society. New efforts are therefore urgently needed, in particular to bridge genomics and the diversity of crops. New programs should include extensive characterisation of genetic diversity in relation to (i) new phenotyping approaches, including proteomics and transcriptomics, and (ii) mapping information at the genetic, physical and sequence levels. Ultimately, all data should be combined using appropriate methodologies and software to extract new knowledge.

Our information system, GnpIS, provides a framework for such an approach. It allows biologists and bioinformaticians to store and retrieve data of many different types, which can then be combined to extract new knowledge. We help biologists manage their data and make the most of it in connection with data obtained elsewhere in the community, and we develop new tools to identify and manage these connections. Interfaces are designed in collaboration with researchers to improve their usability.

Information systems such as GnpIS now play a central role in the analysis of the ever-increasing flow of new genomic data. However, new high-throughput technologies make data integration very challenging. Storage is not the only difficulty: fast data access through queries is even more challenging, and it is critical for researchers to work efficiently with their data. To meet this challenge, we are developing the GnpIS architecture.