#data
Articles with this tag
Apache Spark stands out as one of the most widely adopted cluster computing frameworks for efficiently processing large volumes of complex data. It...
Today we work with substantial amounts of data, and not every record is guaranteed to be free from corruption. PySpark provides us with three...
Create a DataFrame with columns Name and Age. data = [("Alice", 25), ("Bob", 30), ("Alice", 25), ("Kate", 22)] cols = ["Name", "Age"] df =...
To install Kafka we need two things: an instance of Zookeeper and an instance of Kafka. Therefore we will create two services in docker-compose.yaml. The role...
Redis is an in-memory data structure store used as a database, cache, message broker, and streaming engine. How to install Redis? To install Redis on...
Traditional data formats like CSV and JSON are human-readable, and as data grows it becomes very difficult to store such unstructured or...