Designs, develops and programs methods, processes, and systems to consolidate and analyze unstructured, diverse "big data" sources to generate actionable insights and solutions for client services and product enhancement.
Interacts with product and service teams to identify questions and issues for data analysis and experiments. Develops and codes software programs, algorithms and automated processes to cleanse, integrate and evaluate large datasets from multiple disparate sources. Identifies meaningful insights from large data and metadata sources; interprets and communicates insights and findings from analysis and experiments to product, service, and business managers.
Job duties are varied and complex and require independent judgment. May have project lead role. 5 years of relevant work experience. BS/BA preferred.
Oracle is an Affirmative Action-Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability, protected veterans status, age, or any other characteristic protected by law.
The Oracle Data Cloud is an industry leader in connecting online and offline data to execute and measure the effectiveness of marketing initiatives. To enable these insights, the ODC relies on the power of the Oracle Identity Graph to connect thousands of disparate data sources to create comprehensive and accurate anonymized profiles across the numerous ID spaces where marketers are trying to reach consumers. These ID spaces include, but are not limited to, email, mobile phones, tablets, computers, TVs and postal addresses. Creating these anonymized profiles in a privacy-safe and accurate manner is foundational to building targeting audiences at scale as well as detecting a clear signal when measuring the effectiveness of any campaign.
The ODC ID Graph is made possible because of a lot of data and a lot of data science. As a Senior Data Scientist within the Identity Data Science (iDS) Research & Development team, you'll be involved in developing a best-in-class ID Graph that fuels the ODC. In this role, you will be doing a blend of traditional data science work and big data/software engineering. Our responsibility is to build scalable, cost-conscious, stable, repeatable, and accurate machine learning and ETL pipelines in a cloud environment. This work may span all aspects of the data science and software development lifecycle.
We are looking for a statistical subject matter expert with a practical mind who can strike the right balance between statistical solutions and big data realities to enhance our Identity Platform capabilities.
Validate and expand upon existing data processing / statistical / machine learning approaches within the Identity Platform.
Dissect the data in depth using statistical tools to help inform our Identity solutions.
Collaborate with other data scientists and engineers to design, research, and implement new data science products within identity resolution.
Be a source of knowledge and mentorship for data scientists on various aspects of data science from data analysis to machine learning algorithms to visualization.
Develop effective methods for communicating insights (through visualization) on the quality of the ODC Identity Graph.
Explain data science concepts and implications of various approaches in a business setting to non-statistical counterparts in Product and Engineering.
MS or PhD in Computer Science, Mathematics, Physics, Engineering, Statistics, Econometrics, Operations Research or equivalent industry experience.
3 years of experience working with machine learning systems on large, complex data sources.
Excellent at using visualization to summarize complex data into actionable and interpretable results (e.g. functional data, time series, multivariate data, correlation structures, spatial data).
Excellent grasp of the core statistical concepts behind common machine learning models and when to apply them, as well as machine learning model evaluation techniques; ability to assess, diagnose, and reason about a model's performance.
Excellent at working with domain and business experts to reduce complex data, statistical concepts, and results into tangible and actionable strategies for key stakeholders and executives across focus areas.
Proficient in Python.
Proficient in common statistical modeling packages (e.g. scikit-learn, Spark ML).
Proficient in common statistical visualization packages (e.g. ggplot2, matplotlib).
Adaptability and willingness to incorporate effective software engineering best practices and tools (testing, modularization, version control, containers, etc.)
Expert at writing complex SQL queries.
Experience with cloud computing environments.
Experience with Scala.
Exposure to build pipelines (Jenkins), software build tools (Python packaging, sbt, Gradle), and container-based architecture is a plus.
Ability to develop rapid prototypes and work cross-functionally to bring those prototypes to production.