L3 Ops - Server Operations

Morgan Stanley New York, NY 10007

Posted Yesterday

Company Profile

Morgan Stanley is a leading global financial services firm providing a wide range of investment banking, securities, investment management and wealth management services. The Firm's employees serve clients worldwide including corporations, governments and individuals from more than 1,200 offices in 43 countries.

As a market leader, the talent and passion of our people are critical to our success. Together, we share a common set of values rooted in integrity, excellence and a strong team ethic. Morgan Stanley can provide a superior foundation for building a professional career - a place for people to learn, to achieve and to grow. A philosophy that balances personal lifestyles, perspectives and needs is an important part of our culture.


Technology works as a strategic partner with Morgan Stanley business units and the world's leading technology companies to redefine how we do business in ever more global, complex, and dynamic financial markets. Morgan Stanley's sizeable investment in technology results in quantitative trading systems, cutting-edge modeling and simulation software, comprehensive risk and security systems, and robust client-relationship capabilities, plus the worldwide infrastructure that forms the backbone of these systems and tools. Our insights, our applications and infrastructure give a competitive edge to clients' businesses and to our own.


The Unix Operations team is responsible for implementing and managing the Linux infrastructure for Morgan Stanley. The group is involved in evaluation, certification, integration, and maintenance of various products, including hardware, Operating Systems, such as Red Hat Linux and Solaris, system services (DNS, DHCP, NTP, syslog, etc.), file systems (NFS, AFS, Hadoop and various cluster file systems), High Availability, Virtualization technologies and a variety of in-house developed tools.

We interact with a large number of customers from numerous business units to generate improvements that keep the plant running smoothly and easy to manage, and that help us 'run the bank' more efficiently and effectively. We also liaise with engineering groups to set direction for the many disciplines that are part of our day-to-day service portfolio (storage, core infrastructure services and special projects) and create solutions for the low-level components that make our infrastructure tick.

The role is critical to our day-to-day incident management function, with primary responsibilities for:

  • Diagnosis and resolution of immediate production-impacting issues in the electronic trading, compute and storage plants

  • Working with other infrastructure teams including networking, database administration and hosted solution teams for outage resolution, as well as customers aligned with the business users of our plant to determine scope, impact, and appropriate resolution path

  • Carry out proactive health & hygiene tasks to maintain operational stability and compliance for risk & control programs to ensure the production environment is not put at risk

  • Collaborate with engineering teams to test and certify new hardware & software products

  • Collaborate with application development/support teams for proof-of-concept setups for in-house-developed and/or ISV-supplied products.

  • Occasional weekend project work responsibilities to on-board new UNIX assets for growth or large programs such as new datacenter build outs

  • Ability to read complex code and also write scripts using Shell, Perl and Python.

Skills Required:

  • Must have strong knowledge of and experience with Linux, preferably Red Hat, and/or other Linux distributions.

  • Knowledge and experience of various services, e.g. DNS, DHCP, NTP, Kerberos, SSHD, PXE, SFTP, HTTPD etc.

  • Knowledge of various enterprise server hardware models [blades, rackmount, standalone], networking, routers and switches.

  • Must be able to read, understand and write intermediate to complex scripts using KSH, Bash, Perl, Python etc.

  • Excellent verbal and written communication skills, including the ability to explain technical problems to a non-technical audience.

  • Available for on-call duty (1 week out of every 4-6 weeks, rotated weekly within the team), acting as the point person for any production issues.

  • Ability to work in a globally distributed team.

Skills Desired:

  • Experience with troubleshooting incidents involving compute resources, network problems, remote storage problems [SAN, NAS] etc.

  • Experience with analyzing and diagnosing kernel crash/core dumps and network packet captures, and identifying the root cause of problems from them.

  • Sound knowledge of networking, TCP/IP, Layer 2/3 network design, bonding, routing, firewalls (host- and appliance-based), switches and routers etc.

  • Experience working in a DevOps environment.

  • Knowledge and experience with various server hardware models and vendors, e.g. IBM, Dell, HP etc.

  • Ability to identify performance bottlenecks and tune the system parameters to provide more throughput.

  • Good understanding and knowledge of Load Balancing, High Availability and BCP.

  • Good understanding of the workings of configuration management tools: Red Hat Satellite servers, Puppet, Chef, SaltStack, etc.

  • Good knowledge and understanding of Clustering, Virtualization, NAS, NFS and SAN.
