Big Data Consultant

Ranko Mosic


Top Stories by Ranko Mosic

Machine Learning is a critical part of extracting value from Big Data. Choosing a proper model, preparing data, and getting usable results on large-scale data is a non-trivial exercise. Typically the process consists of prototyping a model in a higher-level, (mostly) single-machine tool such as R, Matlab, or Weka, then recoding it in Java or some other language for large-scale deployment. This process is fairly involved, error prone, slow, and inefficient. Existing tools aimed at automating and improving this process are still somewhat immature, and wide-scale enterprise adoption of Machine Learning is still low. Efforts are under way to address this gap, i.e., to make enterprise-class Machine Learning more accessible and easier. Spark is a new, purpose-built, distributed, in-memory engine that makes it possible to perform compute-intensive jobs on commodity hardware clusters. One of ap... (more)
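As a minimal illustration of the kind of workload Spark targets, the Scala sketch below trains a logistic regression model with Spark's MLlib on a cached, in-memory RDD. The input path, file layout ("label,f1,f2,..." per line), and local master setting are illustrative assumptions, not details from the article:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors

object SparkMLSketch {
  def main(args: Array[String]): Unit = {
    // "local[*]" uses all local cores; point at a cluster master for real deployments
    val sc = new SparkContext("local[*]", "ml-sketch")

    // Hypothetical input: CSV lines of "label,feature1,feature2,..." on HDFS
    val data = sc.textFile("hdfs:///data/training.csv").map { line =>
      val parts = line.split(',').map(_.toDouble)
      LabeledPoint(parts.head, Vectors.dense(parts.tail))
    }.cache() // keep the parsed dataset in cluster memory across iterations

    // Iterative training re-reads the same RDD each pass,
    // which is exactly where the in-memory cache pays off
    val model = LogisticRegressionWithSGD.train(data, 100)

    sc.stop()
  }
}
```

The point of the sketch is that the same code runs on a laptop during prototyping and on a cluster in production, which is precisely the prototype-to-deployment gap described above.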

HBase Big Data on Amazon Web Services

Hadoop is designed to store extremely large volumes of data. HBase, an open source NoSQL data store, makes it possible to randomly access such large data sets. HBase is included in Cloudera's Hadoop distribution. One of the major obstacles to wider adoption of NoSQL databases is the lack of query languages, i.e., the lack of comprehensive non-programmatic interfaces to the data inside a NoSQL data store. We expect NoSQL databases to come up with such query languages in the near future. In the meantime, Quest's Toad for Cloud fills this gap and makes it easy to seamlessly access NoSQL, Cloud and ... (more)
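To make the "random access, but no query language" point concrete, here is a short HBase shell session (table, column family, and row key names are made up for illustration). Access is by row key and explicit column coordinates rather than by a declarative query, which is exactly the gap SQL-style front ends try to fill:

```
hbase> create 'orders', 'cf'
hbase> put 'orders', 'row1', 'cf:amount', '100'
hbase> get 'orders', 'row1'
hbase> scan 'orders', {STARTROW => 'row1', STOPROW => 'row9'}
```

Each operation addresses individual rows or key ranges directly; there is no built-in way to express joins or ad hoc filters the way a SQL interface would.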

Towards Next Generation Enterprise IT

Data processing power is likely to continue growing. Are contemporary IT development methods, processes, and procurement practices properly positioned to take advantage of these increasing capabilities? CPU, memory, and storage can today be provisioned in a few clicks. Limitations imposed by processing power and physical infrastructure will matter less than they did in the past. We are gradually approaching a situation where the constraints that shaped present corporate IT standards are no longer an issue. For example, it is current practice that OLTP and analyti... (more)

Oracle Database Backups to the Amazon Cloud

The traditional way of performing backups is to use Oracle RMAN in combination with media management layer software (typically NetBackup, Tivoli, or similar), which writes backup data to a remote robotic tape unit. Tapes are then stored offsite at a secure location. It is a well-known fact that tape media poses certain challenges in the areas of reliability and physical handling. The main attraction of cloud-based backups is that they are inherently disk based, always accessible, and offsite, and there is no capital expenditure. All tape-related costs are thus eliminated. On the other hand ... (more)
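As a sketch of what such a cloud backup looks like in practice, the RMAN fragment below configures an SBT channel through the Oracle Secure Backup Cloud Module and runs a full backup. The library and parameter file paths are illustrative assumptions; actual locations depend on the installation:

```
RMAN> CONFIGURE CHANNEL DEVICE TYPE SBT
        PARMS 'SBT_LIBRARY=/opt/oracle/lib/libosbws11.so,
               ENV=(OSB_WS_PFILE=/opt/oracle/config/osbws.ora)';
RMAN> BACKUP DEVICE TYPE SBT DATABASE PLUS ARCHIVELOG;
```

From RMAN's point of view this is an ordinary SBT (tape) channel; the cloud module simply redirects the backup stream to Amazon S3 instead of a media manager and tape robot.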

Oracle Disaster Recovery Site Hosted by Amazon Cloud

DR sites are typically built as an exact replica of the primary site. Application and database software is installed at the DR site and sits there mostly unused, waiting for a disaster to happen. A DR site is a very expensive proposition that only large companies are able to afford. Amazon AWS is an interesting alternative to having your own DR site. Oracle databases on the DR side run in a Data Guard configuration with the primary site and actively apply archive log files shipped from there. The pay-per-use, scalable Amazon Cloud model makes it an attractive alternative to creating and maintai... (more)
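A minimal sketch of the Data Guard wiring described above might look as follows. The service name (aws_standby) and DB_UNIQUE_NAME (awsdr) are hypothetical placeholders for the AWS-hosted standby:

```
-- On the primary: ship redo to the AWS-hosted standby
ALTER SYSTEM SET LOG_ARCHIVE_DEST_2=
  'SERVICE=aws_standby ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE)
   DB_UNIQUE_NAME=awsdr';
ALTER SYSTEM SET LOG_ARCHIVE_DEST_STATE_2=ENABLE;

-- On the standby in AWS: continuously apply the shipped redo
ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;
```

With the standby running on pay-per-use instances, the apply process keeps the DR copy current without the idle hardware cost of a traditional replica site.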