Big Data Consultant

Ranko Mosic

Top Stories by Ranko Mosic

Machine Learning is a critical part of extracting value from Big Data. Choosing a proper model, preparing data, and getting usable results on large-scale data is a non-trivial exercise. Typically, the process consists of prototyping a model with a higher-level, (mostly) single-machine tool such as R, Matlab, or Weka, then coding it in Java or some other language for large-scale deployment. This process is fairly involved, error-prone, slow, and inefficient. Existing tools that aim to automate and improve this process are still somewhat immature, and wide-scale enterprise adoption of Machine Learning is still low. Efforts are under way to address this gap, i.e. to make enterprise-class Machine Learning more accessible and easier. Spark is a new, purpose-built, distributed, in-memory engine that makes it possible to perform compute-intensive jobs on commodity hardware clusters. One of ap... (more)

Oracle 12c In-Memory, Columnar Database & How It Relates to SAP Hana

All major relational database vendors are developing or already shipping in-memory, columnar databases. The next release of Oracle 12c, an in-memory, columnar database, will be available next year. It will feature simultaneous transaction-level updates to both row and column stores, i.e. data will be stored in both formats at the same time, in the same transaction. This is quite an improvement over SAP Hana's awfully clumsy delta merge process (data changes in SAP Hana are first accumulated in a delta store, then periodically merged into the column store, a process which locks targe... (more)

Column Store, In-Memory, MPP Databases and Oracle

(For the latest information on the Oracle 12c database update, please refer to the following article: Oracle 12c Database and How It Relates to SAP Hana.) RDBMSs are stable and mature products. While there is nothing radically new on the horizon that would challenge Codd's relational theory and related advances in data processing, there are some developments that force established vendors like Oracle to come up with new features and products. Column Stores and Oracle: The column store concept has been around for quite a while. Vendors like HP Vertica grabbed some market share in data warehousing seg... (more)

Big Data, Machine Learning and Innovation

Big Data and its most prominent technical ingredient, Machine Learning, are all the rage these days, as the IT industry tries to convince companies that a technology revolution is underway ("If you are not doing it, your competitors sure are, and by the time you realize it, it will be too late"). Data fracking, i.e. Big Data, is the new oil of the 21st century that will power and grease stalled industries and reignite growth, or so the story goes. While advanced analytics (it comes under various names: predictive analytics, data mining, and, more recently, data science) is great and in use fo... (more)

Mainstream Business Applications and In-Memory Databases

(Please refer to the following article for an update on IMDB/columnar databases: Oracle 12c In-Memory, Columnar Database & How It Relates to SAP Hana.) Contemporary large servers are routinely configured with 2TB of RAM. It is thus possible to fit an entire average-size OLTP database in memory directly accessible by the CPU. There is a long history of academic research on how best to utilize relatively abundant computer memory. This research is becoming increasingly relevant as databases serving business applications head towards memory-centric design and implementation. If you si... (more)