Big Data Consultant

Ranko Mosic


Top Stories by Ranko Mosic

Hadoop and AWS are enterprise-ready, distributed cloud computing technologies. It is straightforward to add more DataNodes, i.e., storage, to a Hadoop cluster on AWS: you just need to create another AWS instance and add the new node to the Hadoop cluster. Hadoop takes care of rebalancing storage to keep file system utilization across DataNodes as even as possible. Cloudera's distribution of Hadoop includes Cloudera Manager, which makes it simple to install Hadoop and add new nodes to it. The screenshot below shows an existing HDFS service with two DataNodes. We will expand HDFS by adding a third DataNode to it. Once we click the Add button, a new host can be picked from the list of available servers (in this case it is server ip-10-0-0-40). The new server is now DataNode-3, i.e., it is part of our Hadoop cluster. The new DataNode-3 still does not contain any data. Hadoop Balancer ... (more)
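The teaser cuts off at the Hadoop Balancer; as a minimal sketch, the rebalancing step that spreads existing blocks onto a newly added DataNode can be triggered from the command line (the threshold value below is illustrative, not from the article):

```shell
# Trigger HDFS rebalancing so existing blocks migrate onto the new DataNode.
# -threshold 5 means: consider the cluster balanced once every DataNode's
# utilization is within 5 percentage points of the cluster-wide average.
hdfs balancer -threshold 5
```

The command runs until the cluster is balanced to within the threshold, and can be safely interrupted and resumed.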

Which Cloud Service Provider Should Host Your Oracle Databases?

A big shift toward the Cloud has started. It is now clear that this change is similar in magnitude to the shift from mainframe to client-server computing two decades ago. Amazon Web Services is the pioneer and market leader in the Cloud computing space. Other vendors are playing catch-up and do not come close to the breadth and scale of AWS's offerings. The services and features Amazon provides are quite extensive and cover many enterprise-class computing needs. APIs and command-line interfaces are available for each service, which makes scripting and automation achievable. ... (more)
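Since the excerpt highlights CLI-driven scripting and automation, here is a small sketch using the AWS CLI (the region and filter are assumptions for illustration):

```shell
# List the IDs and types of all running EC2 instances in one region --
# the kind of one-liner that makes scripting against AWS practical.
aws ec2 describe-instances \
  --region us-east-1 \
  --filters "Name=instance-state-name,Values=running" \
  --query "Reservations[].Instances[].[InstanceId,InstanceType]" \
  --output text
```

The `--query` and `--output` options let the same command feed either a human or a downstream script.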

Economical Data Warehousing Using Amazon Web Services and Hadoop

Sqoop makes it very easy to transfer data between Oracle and Hadoop using a single command. The reason we might want to import data from an Oracle database into Hadoop/Hive is to join Hive tables with Oracle lookup tables, or with other data residing in the Oracle database. Data originating from an Oracle database can help us better understand and analyze the raw, more granular data contained in Hive/HDFS. Sqoop uses a JDBC driver to connect to an Oracle database. If you have a table results in your Oracle database and want data from it to be imported into Hadoop HDFS ( Hadoop... (more)
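A minimal sketch of the single-command import the excerpt describes; the host, port, SID, username, and target directory are placeholders, not values from the article:

```shell
# Import the Oracle table RESULTS into HDFS over JDBC.
# -P prompts interactively for the database password.
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
  --username scott -P \
  --table RESULTS \
  --target-dir /user/hive/warehouse/results \
  --num-mappers 1
```

With more mappers, Sqoop splits the import across parallel tasks keyed on the table's primary key.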

Column Store, In-Memory, MPP Databases and Oracle

(For the latest information on the Oracle 12c database update, please refer to the following article: Oracle 12c Database and How It Relates to SAP Hana.) RDBMSs are stable and mature products. While there is nothing radically new on the horizon that would challenge Codd's relational theory and related advances in data processing, there are some developments that force established vendors like Oracle to come up with new features and products. Column Stores and Oracle: the column store concept has been around for quite a while. Vendors like HP Vertica grabbed some market share in the data warehousing seg... (more)

Mainstream Business Applications and In-Memory Databases

(Please refer to the following article: Oracle 12c In-Memory Database Is Out - Hardly Anybody Notices for an update on Oracle 12c databases.) Contemporary large servers are routinely configured with 2TB of RAM. It is thus possible to fit an entire average-size OLTP database in memory directly accessible by the CPU. There is a long history of academic research on how best to utilize relatively abundant computer memory. This research is becoming increasingly relevant as databases serving business applications head toward memory-centric design and implementation. If you simply place... (more)
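As a toy illustration of a database living entirely in RAM (this uses SQLite's `:memory:` mode, not the Oracle in-memory feature the article discusses; the table and values are made up):

```shell
# The entire database lives in RAM; nothing is written to disk,
# and it vanishes when the process exits.
sqlite3 :memory: \
  "CREATE TABLE orders(id INTEGER, amount INTEGER);
   INSERT INTO orders VALUES (1, 100), (2, 250);
   SELECT SUM(amount) FROM orders;"
```

All reads and writes hit memory-resident pages, which is the access pattern memory-centric database designs optimize for.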