Data mining
The computing activity devoted to the intensive exploitation of large data volumes is mainly tied to the research of the Ondes (Waves) team, in particular to methods based on ambient noise. This expertise sits at the interface between geophysicists, HPC specialists and database specialists, in the context of national and European projects.
The codes and associated documentation developed as part of the "Data Mining" activity are available on the GRICAD GitLab forge (or on the OSUG forge for projects not yet migrated). Note that the software forges and/or documentation of some of these projects have restricted access.
WIN->MSEED format conversion tools / pre-processing / correlations / doublets & inversion (Whisper and F-Image projects) :
- code for the WIN to MSEED conversion tools
- Whisper project wiki and journal, documentation of the Japanese data, documentation and code for data pre-processing
- documentation and code for the correlation/doublet/inversion tools
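To illustrate the kind of operation the correlation tools above perform, here is a minimal sketch that cross-correlates two traces and reads off the lag of the correlation peak. It is not taken from the Whisper or F-Image codes: the function names (`cross_correlation`, `best_lag`) are hypothetical, and a deterministic Gaussian wavelet stands in for real ambient-noise records.

```python
import math

def cross_correlation(a, b, max_lag):
    """Return {lag: sum_t a[t] * b[t + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(len(a)):
            u = t + lag
            if 0 <= u < len(b):  # ignore samples that fall outside trace b
                total += a[t] * b[u]
        result[lag] = total
    return result

def best_lag(a, b, max_lag):
    """Lag (in samples) at which the cross-correlation is largest."""
    cc = cross_correlation(a, b, max_lag)
    return max(cc, key=cc.get)

# Synthetic data (assumption): station B records the same wavelet as
# station A, delayed by 5 samples.
wavelet = [math.exp(-0.5 * ((i - 20) / 3.0) ** 2) for i in range(64)]
delay = 5
trace_a = wavelet
trace_b = [0.0] * delay + wavelet[:-delay]

lag = best_lag(trace_a, trace_b, max_lag=10)
print("estimated delay (samples):", lag)
```

A production code would typically compute the correlation in the frequency domain and stack many time windows; the direct loop here is only meant to make the definition explicit.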
Beamforming tools (Imag’In, RESOLVE projects) :
- wiki RESOLVE
- documentation and code
- MFP code documentation
- documentation and code (old version, no longer maintained)
Visualization tools for beamforming outputs (collaboration with R. Blanch and M. Ortega from LIG)
- documentation and code
Template Matching Tools (EventDetection project)
– documentation and code
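As a rough illustration of template matching, the sketch below slides a short template along a trace and flags the positions where the normalized correlation coefficient reaches a threshold. It is not the EventDetection code: the function names, the synthetic trace and the threshold value are all assumptions chosen for the example.

```python
import math

def normalized_correlation(template, window):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(template)
    mean_t = sum(template) / n
    mean_w = sum(window) / n
    num = sum((x - mean_t) * (y - mean_w) for x, y in zip(template, window))
    den = math.sqrt(sum((x - mean_t) ** 2 for x in template)
                    * sum((y - mean_w) ** 2 for y in window))
    # A flat window has zero variance: treat it as uncorrelated.
    return num / den if den > 0 else 0.0

def detect(template, trace, threshold):
    """Start indices where the sliding correlation reaches the threshold."""
    m = len(template)
    hits = []
    for start in range(len(trace) - m + 1):
        window = trace[start:start + m]
        if normalized_correlation(template, window) >= threshold:
            hits.append(start)
    return hits

# Synthetic trace (assumption): the template is buried twice, with
# different amplitudes, in an otherwise flat record.
template = [0.0, 1.0, -2.0, 1.5, -0.5, 0.0]
trace = [0.0] * 40
for start, amplitude in ((10, 1.0), (25, 0.3)):
    for i, value in enumerate(template):
        trace[start + i] += amplitude * value

detections = detect(template, trace, threshold=0.99)
print("detections at samples:", detections)
```

Because the coefficient is normalized, the weaker copy of the event is detected as reliably as the stronger one, which is the main reason for preferring it over a raw correlation amplitude.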
NoiseCorr_DBF : correlation and double beamforming tools (sanjacinto project) :
– documentation and code
– documentation (old wiki)
Tools for detecting timing errors on dense networks (IWORMS project) :
– online journal
– documentation and code
– wiki
Tools for manipulating and reorganizing datasets of valued data in HDF5 format (Utils project) :
– documentation and code
Tools for measuring flow velocity and particle concentration based on Acoustic Particle Image Velocimetry (ImVort project = Imagerie-Vorticité) :
– documentation and code
Prototyping tools to link the RESIF data center data with the CIMENT-GRICAD HPC infrastructures (Resif-Summer-Ciment project and code)
Other :
Link to the materials of the internal HDF5 training for RESIF, SIG and IPGP staff
Link to the CiGri training, and presentation slides
Link to the site's training offer : tools for data processing, software development and computing (updated as the sessions progress, full version on request)
The technical staff involved have expertise in :
– optimization of sequential codes (numerical methods, choice of languages,...)
– application parallelization (MPI, OpenMP)
– application deployment
– grid computing (CiGri v3)
– iRODS : transfer techniques, metadata management,...
– parallel I/O
– HDF5 data format
– signal processing : gap handling, data decimation,...
– Fortran / C / Python3 / Shell Bash
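As an illustration of the signal-processing items above (handling gaps, or "holes", and decimating data), here is a minimal sketch under simple assumptions: missing samples are marked with `None`, gaps are filled by linear interpolation, and decimation is done by block averaging. The function names are hypothetical and this is not the team's pre-processing code.

```python
def fill_gaps(samples):
    """Fill None samples: linear interpolation inside, nearest value at edges."""
    valid = [i for i, v in enumerate(samples) if v is not None]
    if not valid:
        raise ValueError("trace contains no valid sample")
    filled = list(samples)
    # Leading and trailing gaps: copy the nearest valid sample outward.
    for i in range(valid[0]):
        filled[i] = samples[valid[0]]
    for i in range(valid[-1] + 1, len(samples)):
        filled[i] = samples[valid[-1]]
    # Interior gaps: straight line between the two bracketing valid samples.
    for left, right in zip(valid, valid[1:]):
        step = (samples[right] - samples[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = samples[left] + step * (i - left)
    return filled

def decimate(samples, factor):
    """Average consecutive blocks of `factor` samples, one output per block.

    Block averaging acts as a crude low-pass filter; an incomplete final
    block is dropped.
    """
    usable = len(samples) - len(samples) % factor
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, usable, factor)]

raw = [None, 1.0, 2.0, None, None, 5.0, 6.0, None]
filled = fill_gaps(raw)
reduced = decimate(filled, 2)
print(filled)
print(reduced)
```

In practice a proper anti-aliasing filter (for example `scipy.signal.decimate`) would be preferred over block averaging, and long gaps are often better left flagged than interpolated.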
Contacts for the activity ’Data mining’ :
– Michel Campillo, Philippe Roux, Florent Brenguier : researchers, leaders of the F-Image, RESOLVE and Pacific projects
– Albanne Lecointre, IR CNRS BapE, Ondes (Waves) team
Specific skills
– Fortran, C, Python3
– MPI, OpenMP, grid computing, ...
– Scientific computing libraries : BLAS, LAPACK, SciPy, Intel MKL
– h5py, ObsPy, NumPy, SciPy, Matplotlib, ...
– File formats : HDF5, SEED, (NetCDF3)
– data scan : msi
– format conversion : sac2mseed, win2sac_32
– metadata extraction : rdseed
– Data type : seismological
Hardware and software resources supporting the 'Data mining' activity
– Link to the CIMENT/GRICAD computing center
– Link to the ISTerre computing resources
Links with other IT activities and resources at ISTerre and OSUG
==> ISTerre Data Centre
==> Laboratory IT resources
==> OSUG Storage Center
Links with other ISTerre technical platforms
Last updated on 07/04/2022