Python Data Engineer Houston, TX

ENVISN INCORPORATED

Full Time • Houston, TX
Job Title: Python Data Engineer
Location: Houston, TX (Onsite role)
Duration: Long term contract
 
Job Description:
We are looking for a talented Data Engineer with expertise in Python data processing. The ideal candidate will have a strong background in Python API development, parallel data processing, and distributed systems design. You will be responsible for building and maintaining systems that handle large-scale data processing tasks, ensuring high performance and scalability.

Key Responsibilities:
Python API Development:
o    Develop and maintain RESTful APIs using Python web frameworks such as FastAPI or Django.
o    Collaborate with front-end developers to integrate user-facing elements with server-side logic.

Parallel Data Processing:
o    Utilize Pandas, NumPy, and other libraries to process large datasets efficiently.
o    Implement multithreading, multiprocessing, and asynchronous programming techniques.
o    Optimize data processing pipelines to handle millions of rows with minimal latency.
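
As a rough illustration of the kind of work described above (a minimal sketch using only the standard library, not this team's actual code), a large list of rows can be split into chunks and processed across worker processes. The function names, the sum-of-squares workload, and the chunk size are all hypothetical stand-ins for real per-chunk processing.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from typing import List, Optional, Sequence


def split_into_chunks(rows: Sequence[int], chunk_size: int) -> List[Sequence[int]]:
    """Split rows into consecutive slices of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]


def sum_of_squares(chunk: Sequence[int]) -> int:
    """Hypothetical CPU-bound stand-in for real per-chunk work."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(
    rows: Sequence[int],
    chunk_size: int,
    executor_cls=ProcessPoolExecutor,
    max_workers: Optional[int] = None,
) -> int:
    """Process each chunk in a pool worker and combine the partial results.

    executor_cls is injectable (an assumption of this sketch) so the same
    code can run on threads for I/O-bound work or in quick tests.
    """
    chunks = split_into_chunks(rows, chunk_size)
    with executor_cls(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input chunks.
        return sum(pool.map(sum_of_squares, chunks))


if __name__ == "__main__":
    # The guard matters for process pools on platforms that spawn workers.
    total = parallel_sum_of_squares(list(range(1_000_000)), chunk_size=100_000)
    print(total)
```

Processes sidestep the GIL for CPU-bound work, while swapping in ThreadPoolExecutor is usually the better fit when each chunk mostly waits on I/O.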

Distributed Systems Design:
o    Design and implement distributed systems with a focus on scalability and reliability.
o    Understand and apply core concepts such as load balancing and task queues.
o    Use Docker to containerize applications and manage dependencies.
o    (Preferred) Experience with Kubernetes for container orchestration.

Technical Communication:
o    Clearly articulate complex technical concepts to team members and stakeholders.
o    Document system designs, processes, and code effectively.
o    Collaborate with cross-functional teams to align on project goals and deliverables.


Must-Have Qualifications:
 
Experience in Python Web Frameworks:
o    Proficiency with FastAPI, Django, or similar frameworks.
o    C# coding.
o    Understanding of RESTful API principles and best practices.

Docker Knowledge:
o    Ability to create and manage Dockerfiles.
o    Experience with containerization for deployment and development workflows.

Systems Design Understanding:
o    Basic knowledge of load balancing, task queues, and distributed system concepts.
o    Ability to design systems that are scalable and maintainable.

Concurrent and Parallel Computing Skills:
o    Proficiency in multithreading and multiprocessing without relying solely on external libraries or frameworks.
o    Familiarity with asynchronous programming, particularly asyncio in Python.
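
For candidates gauging the asyncio expectation, the following is a minimal sketch of bounded-concurrency I/O using only the standard library. The function names and the simulated fetch (a short sleep standing in for a network or database call) are hypothetical and only meant to show the pattern.

```python
import asyncio
from typing import Dict, List


async def fetch_record(record_id: int, limiter: asyncio.Semaphore) -> Dict[str, object]:
    """Simulate one I/O-bound call while holding a concurrency slot."""
    async with limiter:
        # Stand-in for a real network or database call (assumption).
        await asyncio.sleep(0.01)
        return {"id": record_id, "status": "ok"}


async def fetch_all(record_ids: List[int], max_concurrency: int = 5) -> List[Dict[str, object]]:
    """Fetch all records concurrently, with at most max_concurrency in flight."""
    limiter = asyncio.Semaphore(max_concurrency)
    tasks = [fetch_record(record_id, limiter) for record_id in record_ids]
    # gather returns results in the same order as the submitted awaitables.
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    results = asyncio.run(fetch_all(list(range(20))))
    print(len(results))
```

The semaphore caps how many calls run at once, which is a common way to avoid overwhelming a downstream service.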

Communication Skills:
o    Excellent technical communication abilities.
o    Experience collaborating in team environments and conveying complex ideas clearly.


Preferred Qualifications:
Education:
o    BS or MS in Computer Science

Advanced Data Processing Tools:
o    Experience with Polars, PySpark, or similar tools.
o    Ability to handle large-scale data processing tasks efficiently.

Distributed Computing Experience:
o    Hands-on experience with distributed architectures in Docker.
o    Familiarity with concepts like task queuing, MapReduce, and saga patterns.

Kubernetes Experience:
o    Knowledge of container orchestration using Kubernetes.
o    Experience deploying and managing applications in a Kubernetes cluster.

Problem-Solving at Scale: 
o    Demonstrated ability to solve complex problems using parallel or distributed computing. 
o    Innovative thinking beyond single-threaded processes.
 
Compensation: $45.00 - $50.00 per hour



