Because he wanted to do large-scale phenological studies over Europe and the continental US, Zurita Milla approached the eScience Center a few years ago.
“I had a lot of data. Our models were also relatively slow and we wanted to speed them up by running them on distributed systems.”
Figure 1: an example of aggregated green wave data over North America
Learning from each other
The technology that the phenology researcher works with has changed enormously over time.
“When I was a student we had a desktop application, we were told what buttons to press, and all data was analysed locally. During my PhD, I moved into programming to automate data analysis. Now I need to switch to big data programming frameworks because the data simply don’t fit on my computer. Researchers should not download data anymore but move their code to the cloud, where all the data are available.”
“Thanks to the collaboration with the eScience Center, I was a bit of a pioneer with this way of working at my faculty. Just before the COVID-19 pandemic, we started our own big data centre, because we realised that we had to invest in these technologies and tools.”
Ready-to-use tools for earth observation
For several years, Zurita Milla, the eScience Center and SURF worked together in a so-called alliance project, which enabled them to learn from each other.
“They wanted to know more about geospatial data, and we wanted to acquire knowledge about big data solutions. We discussed the algorithms inside and out with the research software engineers at the eScience Center, and SURF provided the computing infrastructure.”
Drawing on the insights from this project and similar ones, the eScience Center and SURF have now developed ready-to-use tools, infrastructure and storage for Earth observation data.
“This has lowered the entry barrier to big data for the geospatial community.”