Data Science Asked by user87418 on September 16, 2020
I am following this tutorial: https://docs.microsoft.com/en-us/learn/modules/intro-to-azure-databricks/4-using-notebooks
In this tutorial we create a database like this:
%sql
CREATE DATABASE IF NOT EXISTS Databricks;
USE Databricks;
CREATE TABLE IF NOT EXISTS AirlineFlight
USING CSV
OPTIONS (
header="true",
delimiter=",",
inferSchema="true",
path="dbfs:/mnt/training/asa/flights/small.csv"
);
CACHE TABLE AirlineFlight;
SELECT * FROM AirlineFlight;
Where is this database created? Moreover, the tutorial asks the following question:
Question: Which of the following are good applications for Apache Spark? (Select all that apply.)
Querying, exploring, and analyzing very large files and data sets
Joining data lakes
Machine learning and predictive analytics
Processing streaming data
Graph analytics
Overnight batch processing of very large files
Updating individual records in a database
Answer: All but #7. Apache Spark uses SQL to read and performs analysis on large files, but it is not a Database.
If we can create a database using Spark, then why can't we change its records too?
Where is this database created?
A powerful paradigm in modern data storage and processing is the separation of compute and storage. Building systems with decoupled compute and storage has benefits associated with scalability, availability, and cost.
Apache Spark is a distributed data processing engine: it loads data and performs computation on it, but it does not handle permanent storage. In Databricks (you are using Databricks documentation), data is often stored in Delta Lake, which is specifically designed to work with Spark. However, Spark can work with data stored in many other ways, such as other cloud storage (e.g. Amazon S3, Azure Blob), traditional SQL databases, NoSQL databases, HDFS and many more.
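To see where Spark has recorded the storage for the objects in your tutorial, you could try something like the following in a `%sql` cell. This is only a sketch: it assumes the `Databricks` database and `AirlineFlight` table from your question already exist, and the exact rows returned can vary with the Spark and Databricks version.

```sql
-- Show the database's metadata, including the storage location
-- that the metastore has recorded for it.
DESCRIBE DATABASE EXTENDED Databricks;

-- Show the table's metadata. Because the table was defined with a
-- path option, its location is expected to be the existing CSV file
-- rather than a new copy of the data.
DESCRIBE TABLE EXTENDED Databricks.AirlineFlight;
```

A notebook cell usually displays only the result of its last statement, so it may be easier to run the two statements in separate cells. In both cases the output is metadata that points at files in storage; Spark itself only reads from that location when you query the table.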
Answered by Robert Long on September 16, 2020