[HAL] BigDataFr recommends: HaoLap – a Hadoop based OLAP system for big data #datascientist

Abstract: ‘In recent years, facing information explosion, industry and academia have adopted distributed file system and MapReduce programming model to address new challenges the big data has brought. Based on these technologies, this paper presents HaoLap (Hadoop based oLap), an OLAP (OnLine Analytical […]

[Visionary Marketing – Video] BigDataFr recommends: Data scientists, with Thomas Gerbaud of Mantiq #datascientist

Video excerpt: ‘For me […], a data scientist is someone who collects data, whatever the format and whatever the size, and applies effective tools to get something out of it. Basically, they will have to […]

[Galvanize] BigDataFr recommends: Eight Tools That Show What’s on the Horizon for the Python Data Ecosystem

‘Galvanize recently attended the Dato Data Science Summit in San Francisco, a gathering of more than 1,000 data scientists and researchers from industry and academia to discuss and learn about the most recent advances in data science, applied machine learning, […]

[CIO] BigDataFr recommends: How data science can turn the vision of connected vehicles into reality #datascientist #machine learning

‘As per NHTSA statistics, more than 32,000 people lost their lives in the United States in 2013 in road accidents. There is no better use for technology than saving lives. Connected vehicles represent a seismic movement that is ready for prime […]

[O’R] BigDataFr recommends: What it means to “go pro” in data science #datascientist

‘My experience of being a data scientist is not at all like what I’ve read in books and blogs. I’ve read about data scientists working for digital superstar companies. They sound like heroes writing automated (near sentient) algorithms constantly churning out insights. I’ve read […]

[HAL] BigDataFr recommends: NumaGiC – a Garbage Collector for Big Data on Big NUMA Machines #datascientist #machine learning

Introduction (excerpt): ‘Data-analytics programs require large amounts of computing power and memory. When run on modern multicore computers with a cache-coherent Non-Uniform Memory Access (ccNUMA) architecture, they suffer from a high overhead during garbage collection (GC) caused by a bad memory […]

[arXiv] BigDataFr recommends: Improving Big Data Visual Analytics with Interactive Virtual Reality #datascientist #machine learning

‘For decades, the growth and volume of digital data collection has made it challenging to digest large volumes of information and extract underlying structure. Coined “Big Data”, massive amounts of information has quite often been gathered inconsistently (e.g., from many sources, of various forms, […]

[HAL] BigDataFr recommends: FP-Hadoop: Efficient Execution of Parallel Jobs Over Skewed Data #datascientist

Abstract: ‘Big data parallel frameworks, such as MapReduce or Spark have been praised for their high scalability and performance, but show poor performance in the case of data skew. There are important cases where a high percentage of processing in the reduce side ends […]