SRE evolution road show

**The Evolution Path**

**The Manual Era**

In the early days, our frontend sat behind a layer-4 load balancer. Static resources were cached through Varnish or Squid, while dynamic requests were served by a LAMP stack. At that time there were very few machines, and few processes were required. There was no clear distinction between application operations and system maintenance. The operations team was also small, with each person responsible for network, machines, and services. Most of the work was done manually, and there was no formal operations and maintenance system in place. Many startups today still use this kind of architecture.

**Cloud Infrastructure**

As the business evolved, so did our architecture. Especially after entering the mobile era, the proportion of mobile traffic increased significantly. The access layer no longer served only web resources but also numerous API services. Backend development languages expanded beyond PHP to include Java, Python, C++, and others, depending on service requirements. This marked the beginning of a shift toward microservices. As the business architecture changed, the underlying infrastructure followed suit. By mid-2014, all business had already moved to the cloud, as shown below.

![Cloud SRE Development and Practice](http://i.bosscdn.com/blog/0///)

One of the key benefits of running on the cloud is the abstraction of the underlying host and network. The cloud platform encapsulates tasks such as host creation and network policy modification into a unified system and offers users a consistent interface. Maintenance became more streamlined as previously complex processes were simplified. During this period, the SRE team was established, and operations were divided into different areas.
The cloud computing team (Meituan Cloud) focused on hosts, networks, and systems, while the SRE team worked closely with service teams to manage the machine environment, optimize architecture, and resolve business-related issues.

**Problems & Solutions**

Next, we introduce the problems we encountered while building our cloud infrastructure, along with some of the solutions.

![Cloud SRE Development and Practice](http://i.bosscdn.com/blog/0///)

As shown in the figure above, one of the main issues was resource isolation. Because online VMs shared the host's CPUs and network cards, a high-load stress test could saturate the host's network bandwidth and cause service failures. For example, when traffic from a test VM spiked, it affected the other VMs on the same host, causing critical services to hang. To address this, we took two measures: isolating network resources with per-VM quotas, and separating host clusters based on business characteristics. Offline services were placed in separate clusters, while online services were divided into smaller clusters according to their importance.

Another issue was VM scattering, that is, spreading the VMs of one service across different hosts. Initially, services were deployed in large clusters, which made the impact of a single host failure less severe. However, as services were split into smaller units, a single host failure could take down half of a service's capacity, especially during peak times. After more than six months of optimization, the VM scatter rate rose to over 90%, with no more than two VMs of the same service on the same host.

A third challenge was improving the scheduling success rate. Through collaboration between the SRE and cloud computing teams, the success rate reached three nines (99.9%).

**Cloud Infrastructure**

![Cloud SRE Development and Practice](http://i.bosscdn.com/blog/0///)

The diagram above shows our cloud infrastructure network. The public network entrance primarily uses BGP links, while high-speed dedicated lines between data centers ensure stability.
These have been proven by large-scale online businesses such as food delivery, group buying, and travel. The network is highly redundant, with each node having backup devices to prevent service disruption. Self-developed components such as MGW and NAT allow for more flexible traffic control. Meituan is one of the largest users of Meituan Cloud. Benefits include comprehensive API support, customized resource isolation, multi-datacenter fiber connections, and higher resource utilization.

**Operation and Maintenance Automation**

With the rapid growth of orders and machines, automation became essential for efficient operations. Over the course of this evolution, we developed our own methodology. Complex tasks were simplified by letting the cloud platform manage the basic devices and expose only an API or web interface. Simple tasks were standardized through naming conventions, system environments, and operational procedures. These standards were refined through real-world practice and eventually converged into a unified standard. Once standards were set, we turned them into processes, such as machine creation, first as SOPs and eventually as automated workflows. Manual tasks were replaced with code, achieving full automation.

![Cloud SRE Development and Practice](http://i.bosscdn.com/blog/0///)

This is the service tree, which maps cloud hosts to services and service owners. It provides a hierarchical view and integrates multiple peripheral systems through service tags. Currently these include a configuration management system, a capacity system, a monitoring platform, and access control for online hosts. Cost accounting has also been integrated into the service tree, so users can view the cost of each business group simply by navigating the service nodes.

![Cloud SRE Development and Practice](http://i.bosscdn.com/blog/0///)

The image above illustrates the machine creation process. An engineer initiates the request, which is then handled by the process center.
The service tree provides service information, which is sent to the operations platform. The platform then creates the machine via the cloud platform. After creation, the machine is added to the service node, and the configuration management system initializes the environment. Monitoring is added automatically, and the deployment system deploys the service. Finally, the service is registered on the service governance platform and becomes available. Once initiated, this entire process completes automatically.

**Data Operations**

![Cloud SRE Development and Practice](http://i.bosscdn.com/blog/0///)

As shown in the figure, the company has grown significantly, and we've made corresponding structural changes. The red section represents the cloud platform, covering everything from the access layer to the infrastructure. The middle layer is managed by SRE. With the process system complete, our next focus is data operations. Fault management is one of the first steps: it provides unified tracking of online faults, including time, cause, and responsible person, classified by severity. We continue to improve post-failure handling to ensure every follow-up task is completed. Through the fault platform, we analyze all incidents, classify them, and identify trends so that we can make targeted improvements. We also conduct data mining, analyzing service metrics such as request volume and response time to gain insights.

**Responsibility & Mission**

![Cloud SRE Development and Practice](http://i.bosscdn.com/blog/0///)

As shown, our mission has shifted from fire-fighting to proactive change and stability. Through data operations we drive business improvements, with stability at the core of our work. Operations means improving service quality through experience and data analysis, while maintenance means meeting online service needs with professional expertise. Next, let's look at how we protect stability in practice.
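
The fault tracking described under Data Operations can be illustrated with a small sketch. It is not the actual fault platform: the `Fault` record, its field names, the severity scale, and the sample entries are all hypothetical, chosen only to mirror the attributes mentioned above (time, cause, responsible person, severity).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fault:
    """One online fault record. Field names are illustrative assumptions."""
    occurred_at: datetime
    cause: str      # e.g. "change", "capacity", "hidden_danger"
    owner: str      # responsible person
    severity: int   # assumed scale: 1 is the most severe


def summarize(faults):
    """Count faults per cause and per (month, cause) to expose trends."""
    by_cause = Counter(f.cause for f in faults)
    by_month = Counter(
        (f.occurred_at.strftime("%Y-%m"), f.cause) for f in faults
    )
    return by_cause, by_month


# Made-up sample data, for illustration only.
faults = [
    Fault(datetime(2016, 1, 5), "change", "alice", 2),
    Fault(datetime(2016, 1, 20), "capacity", "bob", 1),
    Fault(datetime(2016, 2, 3), "change", "carol", 3),
]

by_cause, by_month = summarize(faults)
print(by_cause.most_common(1))  # [('change', 2)]
```

Grouping by cause first, then by month, is one simple way to see which failure category deserves targeted work; a real platform would likely add more dimensions such as service or business line.
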

**Business Stability Guarantee Practice**

**Failure Causes & Examples**

First, we summarize the causes of failures and give an example of each.

1. **Change**: Meituan performs over 300 releases a day. Even a minor change, such as an Nginx configuration edit, can cause problems. For example, changing the order of the `rewrite` and `set` directives once brought a service down. Without a proper gray release, the problem went unnoticed until it hit production.
2. **Capacity**: Large events or traffic spikes can overwhelm systems. In one case, a holiday event caused a fivefold increase in traffic, exceeding backend capacity. We learned to know our capacity limits, monitor incoming traffic, and adjust accordingly.
3. **Hidden Dangers**: System design flaws, cross-component calls, and asymmetric link capacities are hard to detect. One incident involved session loss caused by a network change, which led to service downtime. We now conduct full-link exercises, set SLA targets, and manage faults systematically.

**Experience Summary**

The core of accident prevention lies in standard SOPs, capacity assessment, and stress testing. Fast recovery is crucial: a solid emergency plan helps minimize damage. Communication is also vital, because clear feedback makes problem-solving effective. After an accident, we make sure the lessons are learned and the same issues do not recur.

**User Experience Optimization**

Starting from the user side, traffic flows from the public network to the private cloud and finally to the server. Public network issues such as hijacking and multi-carrier environments are addressed through BGP and HTTP DNS. We are also exploring the SPDY and HTTP/2 protocols to improve performance.

**Future Prospects**

Technically, we are focusing on automation and moving toward intelligence. AI algorithms are being tested for automatic fault detection. On the product side, our tools are being productized for broader use. Finally, we aim to provide cutting-edge technical references for Meituan users.
The cloud is the future: it encapsulates many low-level issues, allowing us to focus on more important tasks.
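
As a rough illustration of the capacity lesson above (know your limits and compare them with incoming traffic), the following sketch checks whether a forecast peak fits within current capacity. Every number here is made up, and the function name, the per-instance throughput (which would normally come from stress testing), and the 70% utilization target are assumptions rather than figures from our systems.

```python
import math


def capacity_report(instances, qps_per_instance, peak_qps,
                    target_utilization=0.7):
    """Compare a forecast peak with usable capacity.

    qps_per_instance is assumed to come from stress testing;
    target_utilization leaves headroom and is an assumed policy value.
    """
    usable_per_instance = qps_per_instance * target_utilization
    usable_qps = instances * usable_per_instance
    needed = math.ceil(peak_qps / usable_per_instance)
    return {
        "usable_qps": usable_qps,
        "ok": peak_qps <= usable_qps,
        "instances_needed": needed,
        "shortfall": max(0, needed - instances),
    }


# Hypothetical service: 10 instances, 500 QPS each, 2,000 QPS baseline.
baseline_qps = 2_000
normal_day = capacity_report(10, 500, baseline_qps)
holiday = capacity_report(10, 500, 5 * baseline_qps)  # fivefold spike

print(normal_day["ok"], holiday["ok"], holiday["shortfall"])
```

Running a check like this before a large event turns "understand our capacity limits" into a concrete number of instances to add, instead of discovering the shortfall in production.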
