3 Best Practices For Big Data Projects

#2: Bone Up On Open Source

Did we mention your team must be hip to open source platforms if it wants to have a hand in open data?

Most of the leading platforms that have cropped up over the past four years—HortonWorks, Cloudera, MongoDB, and umpteen others—are all firmly rooted in the open source tradition, and that's not likely to change soon. The linchpin for most of these technologies is Apache Hadoop, which enables data processing on information distributed over multiple servers or data sources. You'll also need to study up on Spark, which speeds processing on clusters.

"We are big believers that open source platforms are an integral part of the big data movement, and that open source is going to be incredibly successful," Bodkin said. "We have seen over the past four years that those beliefs continue to be true."

That might be a big philosophical switch for software integrators that have a legacy in the Windows or proprietary server world—as well as their customers.

That doesn't mean Microsoft isn't seriously focused on this space. It is, with technologies that bridge and unify SQLServer and Hadoop. Suffice to say, however, that investing in a big data practice eventually will require a significant training and technical investment.

(Image courtesy of iStockPhoto)