Ramses Alexander Coraspe Valdez

Published in Geek Culture

·Pinned

A Text Analysis of Andres Manuel Lopez Obrador’s Speeches

Text analytics with Python: This article analyzes the text of the speeches, conferences, and interviews of the current president of Mexico. It has an educational aim and no political purpose; you are free to interpret the data in your own way. I personally think that formalize…

Text Analytics

9 min read


Published in AWS in Plain English

·Pinned

Build an ETL pipeline with Apache Airflow and Visualizing AWS Redshift data using Microsoft Power BI

Uber expenses tracking: Have you heard phrases like "Hungry? You’re in the right place" or "Request a trip, hop in, and relax"? Both phrases belong to the emblematic company with disruptive technology, Uber, which is changing the way the world moves with two strong businesses with multimillion-dollar revenue, Uber Rides and Uber…

Data Engineering

11 min read


Published in Geek Culture

·Pinned

Building an Amazon Prime content-based Movie Recommender System

TF-IDF, cosine similarity, BM25, BERT: The aim of this article is to show you how to quickly create a content-based recommendation system. When you select a movie on platforms such as Amazon Prime or Netflix, you may notice that they always show you similar movies that may be to your liking. This document…

Text Analytics

5 min read
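
As a rough illustration of two of the techniques named in the subtitle above (TF-IDF and cosine similarity), here is a minimal sketch using only the Python standard library. It is not the article's implementation: the smoothed IDF formula, the whitespace tokenizer, the function names, and the sample plot descriptions are all assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (simple smoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(index, docs):
    """Index of the document most similar to docs[index], excluding itself."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], vec), i)
        for i, vec in enumerate(vectors)
        if i != index
    ]
    return max(scores)[1]


# Hypothetical plot descriptions, invented for this sketch.
plots = [
    "a detective hunts a serial killer in the city",
    "a police detective tracks a killer through the city",
    "two friends open a bakery in a small town",
]
print(most_similar(0, plots))
```

A content-based recommender along these lines would rank every other title by its similarity score and show the top few; BM25 or BERT embeddings could replace the TF-IDF vectors while keeping the same ranking step.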


Published in Python in Plain English

·Pinned

Understanding Similarity Measures for Text Analysis

Distance metrics of similarity: The aim of this article is to analyze how similar two words or terms are to each other. Distance metrics are the most common techniques for measuring similarity between words. A related area, document similarity, deals with finding similarities between sentences or paragraphs of text…

Text Analysis

5 min read
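
To make the idea of a word-level distance metric concrete, the sketch below computes the Levenshtein (edit) distance and turns it into a similarity score between 0 and 1. The choice of Levenshtein is an assumption for illustration; the excerpt above does not say which metrics the article covers, and the `similarity` normalization is just one common convention.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalize the edit distance to [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))  # 3
print(similarity("kitten", "sitting"))
```

Keeping only two rows of the dynamic-programming table keeps memory proportional to the shorter dimension, which is usually enough when comparing individual words.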


Published in Geek Culture

·Pinned

Dockerizing an Apache Spark Standalone Cluster

Small Big Data ecosystem with in-memory processing: The aim of this article is to show you the use of Docker as a powerful tool to deploy applications that communicate with each other in a fast and stable way. In this case I present a small big data ecosystem for in-memory processing. The ecosystem is based on…

Big Data

5 min read


Published in Python in Plain English

·Apr 20

How to Build a Lossless Data Compression and Data Decompression Pipeline

A parallel implementation of the bzip2 high-quality data compressor tool in Python: The image above shows the architecture of a parallel implementation of the bzip2 data compressor in Python. This data compression pipeline uses algorithms such as the Burrows-Wheeler transform (BWT) and move-to-front (MTF) to improve Huffman compression. For now, the tool focuses only on compressing .csv…

Big Data

6 min read
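
The two transforms named in the excerpt above can be sketched in a few lines. This is a naive, single-threaded illustration and not the article's parallel pipeline: the function names, the `"\x00"` end-of-string marker, and the explicit `alphabet` argument are assumptions made for this example, and the quadratic-time BWT is only suitable for short strings.

```python
EOS = "\x00"  # end-of-string marker; assumed not to occur in the input


def bwt(text: str) -> str:
    """Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + EOS
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str) -> str:
    """Rebuild the original text by repeatedly prepending and sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    original = next(row for row in table if row.endswith(EOS))
    return original[:-1]  # drop the marker


def mtf_encode(text: str, alphabet) -> list:
    """Move-to-front: emit each symbol's current position, then move it to the front."""
    symbols = list(alphabet)
    codes = []
    for ch in text:
        index = symbols.index(ch)
        codes.append(index)
        symbols.insert(0, symbols.pop(index))
    return codes


def mtf_decode(codes, alphabet) -> str:
    """Invert mtf_encode using the same starting alphabet."""
    symbols = list(alphabet)
    chars = []
    for index in codes:
        ch = symbols.pop(index)
        chars.append(ch)
        symbols.insert(0, ch)
    return "".join(chars)


row = "banana,bandana,banana"  # made-up sample input
transformed = bwt(row)
alphabet = sorted(set(transformed))
codes = mtf_encode(transformed, alphabet)
print(codes)
print(inverse_bwt(mtf_decode(codes, alphabet)) == row)  # True
```

BWT tends to group identical characters into runs, and MTF turns those runs into small integers, mostly zeros. That skewed distribution is what lets a Huffman coder assign short codes to the most frequent values.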


Nov 21, 2021

Introduction to Apache Spark

Apache Spark: Apache Spark is a tool widely used by data engineers, data scientists, and machine learning engineers. It was designed and built in 2009 at UC Berkeley as the evolution of the old paradigm that used Hadoop with the MapReduce algorithm. Apache Spark acts as a distributed system, spreading…

Apache Spark

13 min read


Published in AWS in Plain English

·Jun 24, 2021

Build a Big Data Pipeline with PySpark and AWS EMR on EC2

How to Build a Big Data Pipeline with PySpark and AWS EMR on EC2 Spot Fleets and On-Demand Instances: If you are a data scientist ready to take the next step in your career and become an applied scientist, you must leave behind school projects that involve working with small datasets. The true nature of an applied scientist is knowing how to take advantage of computing…

Big Data

7 min read


Jun 3, 2021

Reducing AWS EMR costs with Spot, Task Nodes, and Instance Types

23andMe Engineering

Thank you very much for this information, very useful.

1 min read

Mar 9, 2021

Deep Learning and Computer Vision: From Basic Implementation to Efficient Methods

DataTurks: Data Annotations Made Super Easy

check this out: https://wittline.github.io/Computer-Vision-and-Deep-Learning/

1 min read

