Types of tables in Hive:
1. Managed Table:
In the case of a managed table, the data is controlled by Hive. Creating a table creates a directory for the data on HDFS. Dropping the table deletes the data as well.
The data is stored in Hive's warehouse directory (configured by the hive.metastore.warehouse.dir property).
By default, the data is not stored in any directory other than this one.
• Data is controlled by Hive
• Creates a directory for the data on HDFS
• Dropping the table deletes the data as well
Create a managed table:
Step 1: Create the table. By default, Hive creates a managed table.
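A minimal sketch of Step 1 in HiveQL. The table name country, the column names, and the tab delimiter are assumptions chosen to match the sample data used later in this section; they are not the original commands.

```sql
-- Hypothetical table: names, types, and delimiter are illustrative assumptions.
-- There is no EXTERNAL keyword, so Hive creates a managed table.
CREATE TABLE IF NOT EXISTS country (
  country_name STRING,
  population   BIGINT,
  area         INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;
```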
Step 2: Load the data from the file into the table and check the schema of the table (the data is loaded from the local file system using the keyword LOCAL).
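Step 2 might look like the following sketch. The file path /tmp/country.txt and the table name country are hypothetical; substitute your own.

```sql
-- LOCAL tells Hive to read the file from the local file system
-- rather than from HDFS. The path below is a hypothetical example.
LOAD DATA LOCAL INPATH '/tmp/country.txt' INTO TABLE country;

-- Show the column names and types of the table.
DESCRIBE country;
```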
Step 3: Check the extended schema of the table
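For Step 3, either of the statements below prints the extended table metadata; the table name country is again an assumption. In the DESCRIBE FORMATTED output, the "Table Type" field should read MANAGED_TABLE and the "Location" field shows the HDFS directory that holds the data.

```sql
-- Extended metadata, printed in a compact form.
DESCRIBE EXTENDED country;

-- The same information in a more readable layout,
-- including the "Table Type" and "Location" fields.
DESCRIBE FORMATTED country;
```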
2. External Table:
In the case of an external table, Hive does not delete the data (the HDFS files) when the table is dropped. Only the metadata associated with the table is deleted.
Data can be stored at the desired location.
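A minimal sketch of creating an external table at a chosen location. The table name country_ext, the columns, the delimiter, and the HDFS directory /data/country are all hypothetical.

```sql
-- EXTERNAL marks the table as external; LOCATION points Hive at an
-- HDFS directory of your choice (hypothetical path below).
CREATE EXTERNAL TABLE IF NOT EXISTS country_ext (
  country_name STRING,
  population   BIGINT,
  area         INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/country';

-- Dropping the table removes only the metadata;
-- the files under /data/country stay in place.
DROP TABLE country_ext;
```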
When to use external table:
For the same underlying data, if there are multiple schemas, we go for external tables.
Let's say I have a file with the following data.
CountryName   population   Area
India         88888888     8888
Singapore     99999        99
Suppose I create two tables using this data, say:
Table1 -> countryName population
Table2 -> countryName Area
If I use managed tables in this case and drop one of them, the data is deleted and the other table is left with no data. In this scenario, external tables are useful.
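The scenario can be sketched in HiveQL as follows. The table names, the HDFS directory /data/country, and the tab delimiter are assumptions, and the file is assumed to contain only data rows (no header line). One caveat: Hive maps delimited text fields to columns by position, so the second table declares a placeholder for the population field in order to reach the area field.

```sql
-- Table1 -> countryName, population.
-- Extra trailing fields in each row are ignored.
CREATE EXTERNAL TABLE IF NOT EXISTS country_population (
  country_name STRING,
  population   BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/country';

-- Table2 -> countryName, area.
-- The second field is declared only as a placeholder so that
-- the third field lines up with the area column.
CREATE EXTERNAL TABLE IF NOT EXISTS country_area (
  country_name       STRING,
  population_skipped STRING,
  area               INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/country';

-- Dropping one external table removes only its metadata.
DROP TABLE country_population;

-- The files under /data/country are untouched,
-- so the other table can still read them.
SELECT country_name, area FROM country_area;
```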