SRE Metrics Analyst Intern

Leidos

IT
Remote
USD 48,100-86,950 / year
Posted on Dec 27, 2025

We are seeking a detail-oriented and analytical SRE Metrics Analyst Intern to join our Site Reliability Engineering (SRE) team. In this role, you will be responsible for establishing and managing the collection of metrics related to system performance, reliability, and incidents. You will develop and maintain reporting frameworks to provide actionable insights to stakeholders, driving improvements in our systems and processes. Your work will support the organization’s commitment to delivering high-quality, reliable services.

This role is 50% telework and candidates must be local to the following cities:

Norfolk, VA

Jacksonville, FL

Bremerton, WA

San Diego, CA

Key Responsibilities:

Metrics Collection Framework:

· Design and implement a comprehensive metrics collection framework that captures key performance indicators (KPIs) related to system reliability and operational efficiency.

· Identify relevant metrics and establish methods for collecting, aggregating, and storing data from various sources, including monitoring tools, logs, and databases.

Data Analysis and Visualization:

· Analyze collected metrics to identify trends, patterns, and anomalies that impact system reliability and performance.

· Develop dashboards and visualizations to present data in a clear and actionable manner using tools such as Grafana, Kibana, or Tableau.

· Ensure that stakeholders have access to real-time insights and reports that inform decision-making.

Reporting:

· Create regular reports on system performance, reliability, incident response times, and other critical metrics for various stakeholders, including technical teams and management.

· Provide insights and recommendations based on data analysis to drive continuous improvement initiatives.

· Prepare and present findings to stakeholders, facilitating discussions on reliability goals and performance enhancements.

Collaboration with SRE Teams:

· Work closely with SRE teams to identify their metric needs and ensure alignment with operational goals.

· Collaborate with engineering and operations teams to ensure that metric collection is integrated into development and deployment processes.

· Support incident response efforts by providing metrics that help identify root causes and areas for improvement.

Continuous Improvement:

· Stay current with industry trends and best practices related to metrics collection, monitoring, and reporting within SRE and DevOps.

· Continuously evaluate and enhance the metrics collection and reporting processes to improve data accuracy, relevance, and accessibility.

· Foster a culture of data-driven decision-making within the SRE team and broader organization.

Key Qualifications:

Enrolled in a degree program in a related major, with a GPA of 3.0 or better

US citizenship required

Ability to obtain and maintain a DoD security clearance

Experience:

Experience in metrics collection, data analysis, or reporting, preferably in a Site Reliability Engineering or DevOps environment.

Proven experience in working with monitoring and observability tools (e.g., Prometheus, Datadog, New Relic).

Technical Skills:

Strong understanding of key metrics used in site reliability engineering, including SLIs, SLOs, and SLAs.

Proficiency in data analysis tools and languages (e.g., SQL, Python, R) for data manipulation and reporting.

Experience with data visualization tools (e.g., Grafana, Kibana, Tableau) to create dashboards and reports.

Analytical Skills:

Strong analytical and problem-solving skills, with the ability to interpret complex data sets and provide actionable insights.

Ability to evaluate the relevance and accuracy of metrics and make recommendations for improvement.

Communication and Collaboration:

Excellent communication skills, both written and verbal, with the ability to present data and findings to technical and non-technical audiences.

Proven ability to work collaboratively with cross-functional teams and build strong relationships with stakeholders.

Preferred Qualifications:

Experience with cloud platforms (AWS, GCP, Azure) and their monitoring tools.

Familiarity with incident management processes and practices within an SRE context.

Knowledge of software development methodologies and best practices.

Key Metrics of Success:

Timely and accurate collection of key performance metrics with minimal data discrepancies.

Effective visualization and reporting of metrics that inform decision-making and drive improvements in reliability.

Positive feedback from stakeholders regarding the clarity and usefulness of reports and insights.

Continuous improvement in the SRE metrics collection and reporting processes, leading to better operational performance.

Why Join Us?

Be part of a dynamic and innovative team focused on enhancing the reliability and performance of critical systems. Play a key role in shaping the metrics strategy that drives operational excellence and continuous improvement. Work in an environment that values collaboration, professional development, and a commitment to quality. Contribute to the success of the organization by providing actionable insights that improve system reliability and performance.

Summary:

The SRE Metrics Analyst Intern is crucial for ensuring that the Site Reliability Engineering team has the data and insights needed to maintain and improve system reliability. This role requires a blend of technical expertise, analytical skills, and effective communication to support data-driven decision-making and enhance operational performance. The ideal candidate will have a strong background in metrics collection, data analysis, and reporting, along with a passion for supporting the organization’s reliability goals.

Come break things (in a good way). Then build them smarter.

We're the tech company everyone calls when things get weird. We don’t wear capes (they’re a safety hazard), but we do solve high-stakes problems with code, caffeine, and a healthy disregard for “how it’s always been done.”

Original Posting:

December 26, 2025

For U.S. Positions: While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Pay Range:

$48,100.00 - $86,950.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.