
Metastore





The metastore stores the metadata of Hive tables. It provides two important but often overlooked features of a data warehouse: data abstraction and data discovery.
Without the data abstraction that Hive provides, a user would have to supply information about data formats, extractors, and loaders along with every query. In Hive, this information is given once, when the table is created, and reused every time the table is referenced, much as in traditional warehousing systems. The second feature, data discovery, lets users find and explore relevant, specific data in the warehouse. Other tools can be built on this metadata to expose, and possibly enhance, information about the data and its availability. Hive provides both features through a metadata repository that is tightly integrated with the Hive query processing system, so that data and metadata stay in sync.
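The two features can be pictured with a toy model. This is only a sketch in Python and not how Hive implements its metastore: the class names, field names, table names, and paths are all made up for illustration. It shows format and location being registered once at table creation and looked up later (data abstraction), and the same records being searched to find tables (data discovery).

```python
from dataclasses import dataclass, field


@dataclass
class TableMeta:
    """Hypothetical metadata record; the fields are illustrative only."""
    name: str
    columns: dict            # column name -> type
    location: str            # where the data files live
    file_format: str         # how the files are read, e.g. "TEXTFILE"
    partition_columns: list = field(default_factory=list)


class ToyMetastore:
    """In-memory stand-in for a metadata repository (not Hive's real API)."""

    def __init__(self):
        self._tables = {}

    def create_table(self, meta):
        # Metadata is supplied once, at table creation.
        self._tables[meta.name] = meta

    def describe(self, name):
        # A query only names the table; format and location come from here.
        return self._tables[name]

    def find_tables_with_column(self, column):
        # Data discovery: search the metadata instead of the data itself.
        return sorted(t.name for t in self._tables.values() if column in t.columns)


store = ToyMetastore()
store.create_table(TableMeta(
    "page_views", {"user_id": "bigint", "url": "string"},
    "/warehouse/page_views", "TEXTFILE", ["dt"]))
store.create_table(TableMeta(
    "users", {"user_id": "bigint", "name": "string"},
    "/warehouse/users", "ORC"))

print(store.describe("page_views").file_format)
print(store.find_tables_with_column("user_id"))
```

The point of the sketch is that the caller of `describe` never repeats the file format or location; it only needs the table name.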
Derby is the default metastore database, but it is useful only for a single developer. It is not suitable for production because multiple users cannot use the embedded Derby database at the same time: one user has to log off before another can use it.
Other options are standalone relational databases that Hive can reach over JDBC, such as MySQL or PostgreSQL.
The difference between a local and a remote metastore is where the metastore service runs. With a local metastore, it runs in the same JVM as Hive; with a remote metastore, Hive and the metastore run in separate JVMs.
In production, a remote metastore is generally used.
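As a rough illustration of how the two setups differ in configuration, the Python sketch below renders two minimal hive-site.xml fragments. The property names (javax.jdo.option.ConnectionURL, javax.jdo.option.ConnectionDriverName, hive.metastore.uris) are standard Hive settings, and a remote metastore is reached over Thrift. The host names, database name, and ports are placeholders, the helper function hive_site is hypothetical, and a real deployment needs further settings such as database credentials.

```python
import xml.etree.ElementTree as ET


def hive_site(properties):
    """Render a dict of property name -> value as hive-site.xml text."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Local metastore: the metastore code runs inside the Hive JVM and
# connects to the database directly over JDBC.
# Host and database name are placeholders.
local_conf = {
    "javax.jdo.option.ConnectionURL": "jdbc:mysql://db.example.com:3306/metastore",
    "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
}

# Remote metastore: Hive clients only need the address of the separate
# metastore service. Host is a placeholder.
remote_conf = {
    "hive.metastore.uris": "thrift://metastore.example.com:9083",
}

local_xml = hive_site(local_conf)
remote_xml = hive_site(remote_conf)
print(local_xml)
print(remote_xml)
```

In the remote case the JDBC settings live on the metastore service rather than on each Hive client, which is one reason this layout is convenient when many users share a warehouse.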

FAQ:
1. What is the Hive metastore?
Ans: The Hive metastore is a database that stores metadata about your Hive tables (e.g., table name, column names and types, table location, storage handler in use, number of buckets in the table, sorting columns if any, partition columns if any, etc.). When you create a table, the metastore is updated with the information for the new table, and Hive reads that information whenever you run queries on the table.

2. Can multiple users share the same metastore when Hive runs in embedded mode?
Ans: No, an embedded metastore cannot be shared. It is recommended to use a standalone “real” database such as MySQL or PostgreSQL.