
Hadoop分布式文件系统:架构和设计外文翻译

Foreign Literature Translation

Source text: The Hadoop Distributed File System: Architecture and Design
Chinese title: Hadoop分布式文件系统:架构和设计

Name: XXXX

Student ID: 200708202137

April 8, 2013

English Original

The Hadoop Distributed File System: Architecture and Design

Source: http://hadoop.apache.org/docs/r0.18.3/hdfs_design.html

Introduction

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large data sets. HDFS relaxes a few POSIX requirements to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch web search engine project. HDFS is part of the Apache Hadoop Core project. The project URL is http://hadoop.apache.org/core/.

Assumptions and Goals

Hardware Failure

Hardware failure is the norm rather than the exception. An HDFS instance may consist of hundreds or thousands of server machines, each storing part of the file system’s data. The fact that there are a huge number of components and that each component has a non-trivial probability of failure means that some component of HDFS is always non-functional. Therefore, detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.

Streaming Data Access

Applications that run on HDFS need streaming access to their data sets. They are not general purpose applications that typically run on general purpose file systems. HDFS is designed more for batch processing rather than interactive use by users. The emphasis is on high throughput of data access rather than low latency of data access. POSIX imposes many hard requirements that are not needed for applications that are targeted for HDFS. POSIX semantics in a few key areas have been traded to increase data throughput rates.

Large Data Sets

Applications that run on HDFS have large data sets. A typical file in HDFS is gigabytes to terabytes in size. Thus, HDFS is tuned to support large files. It should provide high aggregate data bandwidth and scale to hundreds of nodes in a single cluster. It should support tens of millions of files in a single instance.

Simple Coherency Model

HDFS applications need a write-once-read-many access model for files. A file once created, written, and closed need not be changed. This assumption simplifies data coherency issues and enables high throughput data access. A Map/Reduce application or a web crawler application fits perfectly with this model. There is a plan to support appending-writes to files in the future.

“Moving Computation is Cheaper than Moving Data”

A computation requested by an application is much more efficient if it is executed near the data it operates on. This is especially true when the size of the data set is huge. This minimizes network congestion and increases the overall throughput of the system. The assumption is that it is often better to migrate the computation closer to where the data is located rather than moving the data to where the application is running. HDFS provides interfaces for applications to move themselves closer to where the data is located.
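One concrete form such an interface takes in the Java API of roughly this era is a block-location query: the client asks the NameNode which hosts hold each block of a file, and a scheduler can then launch work on or near those hosts. The sketch below is illustrative, assuming the org.apache.hadoop.fs classes of that period (older releases exposed the same idea as getFileCacheHints) and a hypothetical path:

    import java.util.Arrays;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class BlockLocations {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up the cluster configuration
            FileSystem fs = FileSystem.get(conf);       // the configured HDFS instance
            Path file = new Path("/foodir/myfile.txt"); // hypothetical path
            FileStatus stat = fs.getFileStatus(file);
            // Ask the NameNode where every block of the file lives.
            BlockLocation[] blocks = fs.getFileBlockLocations(stat, 0, stat.getLen());
            for (BlockLocation b : blocks) {
                // A scheduler would place computation on one of these hosts.
                System.out.println("offset " + b.getOffset() + ", length " + b.getLength()
                        + " -> " + Arrays.toString(b.getHosts()));
            }
        }
    }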

Portability Across Heterogeneous Hardware and Software Platforms

HDFS has been designed to be easily portable from one platform to another. This facilitates widespread adoption of HDFS as a platform of choice for a large set of applications.

NameNode and DataNodes

HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on. HDFS exposes a file system namespace and allows user data to be stored in files. Internally, a file is split into one or more blocks and these blocks are stored in a set of DataNodes. The NameNode executes file system namespace operations like opening, closing, and renaming files and directories. It also determines the mapping of blocks to DataNodes. The DataNodes are responsible for serving read and write requests from the file system’s clients. The DataNodes also perform block creation, deletion, and replication upon instruction from the NameNode.

The NameNode and DataNode are pieces of software designed to run on commodity machines. These machines typically run a GNU/Linux operating system (OS). HDFS is built using the Java language; any machine that supports Java can run the NameNode or the DataNode software. Usage of the highly portable Java language means that HDFS can be deployed on a wide range of machines. A typical deployment has a dedicated machine that runs only the NameNode software. Each of the other machines in the cluster runs one instance of the DataNode software. The architecture does not preclude running multiple DataNodes on the same machine, but in a real deployment that is rarely the case.

The existence of a single NameNode in a cluster greatly simplifies the architecture of the system. The NameNode is the arbitrator and repository for all HDFS metadata. The system is designed in such a way that user data never flows through the NameNode.

The File System Namespace

HDFS supports a traditional hierarchical file organization. A user or an application can create directories and store files inside these directories. The file system namespace hierarchy is similar to most other existing file systems; one can create and remove files, move a file from one directory to another, or rename a file. HDFS does not yet implement user quotas or access permissions. HDFS does not support hard links or soft links. However, the HDFS architecture does not preclude implementing these features.

The NameNode maintains the file system namespace. Any change to the file system namespace or its properties is recorded by the NameNode. An application can specify the number of replicas of a file that should be maintained by HDFS. The number of copies of a file is called the replication factor of that file. This information is stored by the NameNode.

Data Replication

HDFS is designed to reliably store very large files across machines in a large cluster. It stores each file as a sequence of blocks; all blocks in a file except the last block are the same size. The blocks of a file are replicated for fault tolerance. The block size and replication factor are configurable per file. An application can specify the number of replicas of a file. The replication factor can be specified at file creation time and can be changed later. Files in HDFS are write-once and have strictly one writer at any time.
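Both per-file settings are visible through the Java API. The following is a minimal sketch, assuming the FileSystem.create overload of that era that takes an explicit buffer size, replication factor, and block size; the path and values are illustrative only:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CreateReplicated {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            Path file = new Path("/foodir/myfile.txt");  // hypothetical path
            // Create the file with replication factor 3 and a 64 MB block size.
            FSDataOutputStream out = fs.create(file, true /* overwrite */,
                    4096 /* buffer size */, (short) 3 /* replication */,
                    64L * 1024 * 1024 /* block size in bytes */);
            out.write("example record\n".getBytes());
            out.close();
        }
    }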

The NameNode makes all decisions regarding replication of blocks. It periodically receives a Heartbeat and a Blockreport from each of the DataNodes in the cluster. Receipt of a Heartbeat implies that the DataNode is functioning properly. A Blockreport contains a list of all blocks on a DataNode.

Replica Placement: The First Baby Steps

The placement of replicas is critical to HDFS reliability and performance. Optimizing replica placement distinguishes HDFS from most other distributed file systems. This is a feature that needs lots of tuning and experience. The purpose of a rack-aware replica placement policy is to improve data reliability, availability, and network bandwidth utilization. The current implementation for the replica placement policy is a first effort in this direction. The short-term goals of implementing this policy are to validate it on production systems, learn more about its behavior, and build a foundation to test and research more sophisticated policies.

Large HDFS instances run on a cluster of computers that commonly spread across many racks. Communication between two nodes in different racks has to go through switches. In most cases, network bandwidth between machines in the same rack is greater than network bandwidth between machines in different racks.

The NameNode determines the rack id each DataNode belongs to via the process outlined in Rack Awareness. A simple but non-optimal policy is to place replicas on unique racks. This prevents losing data when an entire rack fails and allows use of bandwidth from multiple racks when reading data. This policy evenly distributes replicas in the cluster which makes it easy to balance load on component failure. However, this policy increases the cost of writes because a write needs to transfer blocks to multiple racks.

For the common case, when the replication factor is three, HDFS’s placement policy is to put one replica on one node in the local rack, another on a different node in the local rack, and the last on a different node in a different rack. This policy cuts the inter-rack write traffic, which generally improves write performance. The chance of rack failure is far less than that of node failure, so this policy does not impact data reliability and availability guarantees. However, it does reduce the aggregate network bandwidth used when reading data, since a block is placed in only two unique racks rather than three. With this policy, the replicas of a file are not evenly distributed across the racks: one third of the replicas are on one node, two thirds are on one rack (the local rack, which includes that node), and the remaining third are distributed across the remaining racks. This policy improves write performance without compromising data reliability or read performance.
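For illustration, the decision just described can be written as a few lines of selection logic. This is a sketch of the stated policy, not the NameNode's actual placement code (which also weighs free space and load); every name below is invented, and rackOf stands in for the Rack Awareness mapping:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    public class PlacementSketch {
        // Choose targets for one block under the default 3-replica policy above.
        static List<String> chooseTargets(String writer, Map<String, String> rackOf,
                                          List<String> liveNodes) {
            String localRack = rackOf.get(writer);
            List<String> targets = new ArrayList<String>();
            targets.add(writer);                        // replica 1: the writer's own node
            for (String n : liveNodes)                  // replica 2: another node, same rack
                if (!n.equals(writer) && localRack.equals(rackOf.get(n))) {
                    targets.add(n);
                    break;
                }
            for (String n : liveNodes)                  // replica 3: any node on a different rack
                if (!localRack.equals(rackOf.get(n))) {
                    targets.add(n);
                    break;
                }
            return targets;
        }
    }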

The current, default replica placement policy described here is a work in progress.

Replica Selection

To minimize global bandwidth consumption and read latency, HDFS tries to satisfy a read request from a replica that is closest to the reader. If there exists a replica on the same rack as the reader node, then that replica is preferred to satisfy the read request. If an HDFS cluster spans multiple data centers, then a replica that is resident in the local data center is preferred over any remote replica.

Safemode

On startup, the NameNode enters a special state called Safemode. Replication of data blocks does not occur when the NameNode is in the Safemode state. The NameNode receives Heartbeat and Blockreport messages from the DataNodes. A Blockreport contains the list of data blocks that a DataNode is hosting. Each block has a specified minimum number of replicas. A block is considered safely replicated when the minimum number of replicas of that data block has checked in with the NameNode. After a configurable percentage of safely replicated data blocks checks in with the NameNode (plus an additional 30 seconds), the NameNode exits the Safemode state. It then determines the list of data blocks (if any) that still have fewer than the specified number of replicas. The NameNode then replicates these blocks to other DataNodes.
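An administrator can also inspect or override this state with the dfsadmin tool; the flags below are those of releases from this era:

    bin/hadoop dfsadmin -safemode get     # report whether Safemode is on or off
    bin/hadoop dfsadmin -safemode wait    # block until the NameNode leaves Safemode
    bin/hadoop dfsadmin -safemode enter   # force the NameNode into Safemode
    bin/hadoop dfsadmin -safemode leave   # force the NameNode out of Safemode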

The Persistence of File System Metadata

The HDFS namespace is stored by the NameNode. The NameNode uses a transaction log called the EditLog to persistently record every change that occurs to file system metadata. For example, creating a new file in HDFS causes the NameNode to insert a record into the EditLog indicating this. Similarly, changing the replication factor of a file causes a new record to be inserted into the EditLog. The NameNode uses a file in its local host OS file system to store the EditLog. The entire file system namespace, including the mapping of blocks to files and file system properties, is stored in a file called the FsImage. The FsImage is stored as a file in the NameNode’s local file system too.

The NameNode keeps an image of the entire file system namespace and file Blockmap in memory. This key metadata item is designed to be compact, such that a NameNode with 4 GB of RAM is plenty to support a huge number of files and directories. When the NameNode starts up, it reads the FsImage and EditLog from disk, applies all the transactions from the EditLog to the in-memory representation of the FsImage, and flushes out this new version into a new FsImage on disk. It can then truncate the old EditLog because its transactions have been applied to the persistent FsImage. This process is called a checkpoint. In the current implementation, a checkpoint only occurs when the NameNode starts up. Work is in progress to support periodic checkpointing in the near future.

The DataNode stores HDFS data in files in its local file system. The DataNode has no knowledge about HDFS files. It stores each block of HDFS data in a separate file in its local file system. The DataNode does not create all files in the same directory. Instead, it uses a heuristic to determine the optimal number of files per directory and creates subdirectories appropriately. It is not optimal to create all local files in the same directory because the local file system might not be able to efficiently support a huge number of files in a single directory. When a DataNode starts up, it scans through its local file system, generates a list of all HDFS data blocks that correspond to each of these local files, and sends this report to the NameNode: this is the Blockreport.

The Communication Protocols

All HDFS communication protocols are layered on top of the TCP/IP protocol. A client establishes a connection to a configurable TCP port on the NameNode machine and talks to it using the ClientProtocol. The DataNodes talk to the NameNode using the DataNodeProtocol. A Remote Procedure Call (RPC) abstraction wraps both the ClientProtocol and the DataNodeProtocol. By design, the NameNode never initiates any RPCs; instead, it only responds to RPC requests issued by DataNodes or clients.

Robustness

The primary objective of HDFS is to store data reliably even in the presence of failures. The three common types of failures are NameNode failures, DataNode failures and network partitions.

Data Disk Failure, Heartbeats and Re-Replication

Each DataNode sends a Heartbeat message to the NameNode periodically. A network partition can cause a subset of DataNodes to lose connectivity with the NameNode. The NameNode detects this condition by the absence of a Heartbeat message. The NameNode marks DataNodes without recent Heartbeats as dead and does not forward any new IO requests to them. Any data that was registered to a dead DataNode is not available to HDFS any more. DataNode death may cause the replication factor of some blocks to fall below their specified value. The NameNode constantly tracks which blocks need to be replicated and initiates replication whenever necessary. The necessity for re-replication may arise due to many reasons: a DataNode may become unavailable, a replica may become corrupted, a hard disk on a DataNode may fail, or the replication factor of a file may be increased.

Cluster Rebalancing

The HDFS architecture is compatible with data rebalancing schemes. A scheme might automatically move data from one DataNode to another if the free space on a DataNode falls below a certain threshold. In the event of a sudden high demand for a particular file, a scheme might dynamically create additional replicas and rebalance other data in the cluster. These types of data rebalancing schemes are not yet implemented.

Data Integrity

It is possible that a block of data fetched from a DataNode arrives corrupted. This corruption can occur because of faults in a storage device, network faults, or buggy software. The HDFS client software implements checksum checking on the contents of HDFS files. When a client creates an HDFS file, it computes a checksum of each block of the file and stores these checksums in a separate hidden file in the same HDFS namespace. When a client retrieves file contents it verifies that the data it received from each DataNode matches the checksum stored in the associated checksum file. If not, then the client can opt to retrieve that block from another DataNode that has a replica of that block.
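The mechanism can be illustrated schematically. The sketch below is not HDFS's actual code, which checksums far smaller units within a block (io.bytes.per.checksum, 512 bytes by default) rather than whole blocks, but the verify-or-fall-back logic is the same idea:

    import java.util.zip.CRC32;

    public class ChecksumSketch {
        // Compute a CRC32 checksum over one block's worth of data.
        static long blockChecksum(byte[] block) {
            CRC32 crc = new CRC32();
            crc.update(block, 0, block.length);
            return crc.getValue();
        }

        // On read, recompute and compare with the value stored in the hidden
        // checksum file; on mismatch the client would try another replica.
        static boolean verify(byte[] block, long storedChecksum) {
            return blockChecksum(block) == storedChecksum;
        }
    }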

Metadata Disk Failure

The FsImage and the EditLog are central data structures of HDFS. A corruption of these files can cause the HDFS instance to be non-functional. For this reason, the NameNode can be configured to support maintaining multiple copies of the FsImage and EditLog. Any update to either the FsImage or EditLog causes each of the FsImages and EditLogs to get updated synchronously. This synchronous updating of multiple copies of the FsImage and EditLog may degrade the rate of namespace transactions per second that a NameNode can support. However, this degradation is acceptable because even though HDFS applications are very data intensive in nature, they are not metadata intensive. When a NameNode restarts, it selects the latest consistent FsImage and EditLog to use.

The NameNode machine is a single point of failure for an HDFS cluster. If the NameNode machine fails, manual intervention is necessary. Currently, automatic restart and failover of the NameNode software to another machine is not supported.

Snapshots

Snapshots support storing a copy of data at a particular instant of time. One usage of the snapshot feature may be to roll back a corrupted HDFS instance to a previously known good point in time. HDFS does not currently support snapshots but will in a future release.

Data Organization

Data Blocks

HDFS is designed to support very large files. Applications that are compatible with HDFS are those that deal with large data sets. These applications write their data only once but they read it one or more times and require these reads to be satisfied at streaming speeds. HDFS supports write-once-read-many semantics on files. A typical block size used by HDFS is 64 MB. Thus, an HDFS file is chopped up into 64 MB chunks, and if possible, each chunk will reside on a different DataNode.

Staging

A client request to create a file does not reach the NameNode immediately. In fact, initially the HDFS client caches the file data into a temporary local file. Application writes are transparently redirected to this temporary local file. When the local file accumulates data worth over one HDFS block size, the client contacts the NameNode. The NameNode inserts the file name into the file system hierarchy and allocates a data block for it. The NameNode responds to the client request with the identity of the DataNode and the destination data block. Then the client flushes the block of data from the local temporary file to the specified DataNode. When a file is closed, the remaining un-flushed data in the temporary local file is transferred to the DataNode. The client then tells the NameNode that the file is closed. At this point, the NameNode commits the file creation operation into a persistent store. If the NameNode dies before the file is closed, the file is lost.

The above approach has been adopted after careful consideration of target applications that run on HDFS. These applications need streaming writes to files. If a client writes to a remote file directly without any client side buffering, the network speed and the congestion in the network impact throughput considerably. This approach is not without precedent. Earlier distributed file systems, e.g. AFS, have used client side caching to improve performance. A POSIX requirement has been relaxed to achieve higher performance of data uploads.

Replication Pipelining

When a client is writing data to an HDFS file, its data is first written to a local file as explained in the previous section. Suppose the HDFS file has a replication factor of three. When the local file accumulates a full block of user data, the client retrieves a list of DataNodes from the NameNode. This list contains the DataNodes that will host a replica of that block. The client then flushes the data block to the first DataNode. The first DataNode starts receiving the data in small portions (4 KB), writes each portion to its local repository, and transfers that portion to the second DataNode in the list. The second DataNode, in turn, starts receiving each portion of the data block, writes that portion to its repository, and then flushes that portion to the third DataNode. Finally, the third DataNode writes the data to its local repository. Thus, a DataNode can be receiving data from the previous one in the pipeline while simultaneously forwarding data to the next one; the data is pipelined from one DataNode to the next.
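The forwarding step performed by each intermediate DataNode can be sketched as follows. This illustrates the pipelining idea only and is not DataNode source code; the three streams stand in for the upstream connection, the local block file, and the connection to the next DataNode:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    public class PipelineSketch {
        // Receive a block in 4 KB portions, persisting each portion locally
        // while simultaneously forwarding it to the next DataNode.
        static void relay(InputStream upstream, OutputStream localBlockFile,
                          OutputStream downstream) throws IOException {
            byte[] portion = new byte[4 * 1024];
            int n;
            while ((n = upstream.read(portion)) != -1) {
                localBlockFile.write(portion, 0, n);  // write to the local repository
                downstream.write(portion, 0, n);      // forward to the next DataNode
            }
            localBlockFile.flush();
            downstream.flush();
        }
    }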

Accessibility

HDFS can be accessed from applications in many different ways. Natively, HDFS provides a Java API for applications to use. A C language wrapper for this Java API is also available. In addition, an HTTP browser can also be used to browse the files of an HDFS instance. Work is in progress to expose HDFS through the WebDAV protocol.
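A minimal use of the native Java API, copying a file's contents to standard output. This sketch assumes the cluster configuration is on the classpath; exact package contents vary slightly across releases:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IOUtils;

    public class HdfsCat {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);               // the configured HDFS instance
            FSDataInputStream in = fs.open(new Path(args[0]));  // e.g. /foodir/myfile.txt
            try {
                IOUtils.copyBytes(in, System.out, conf, false); // stream the file contents
            } finally {
                in.close();
            }
        }
    }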

FS Shell

HDFS allows user data to be organized in the form of files and directories. It provides a command-line interface called FS shell that lets a user interact with the data in HDFS. The syntax of this command set is similar to other shells (e.g. bash, csh) that users are already familiar with. Here are some sample action/command pairs:
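The pairs below are representative of the shell in releases of this era; /foodir and myfile.txt are placeholder names:

    Create a directory named /foodir:           bin/hadoop dfs -mkdir /foodir
    Remove a directory named /foodir:           bin/hadoop dfs -rmr /foodir
    View the contents of /foodir/myfile.txt:    bin/hadoop dfs -cat /foodir/myfile.txt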

FS shell is targeted for applications that need a scripting language to interact with the stored data.

DFSAdmin

The DFSAdmin command set is used for administering an HDFS cluster. These are commands that are used only by an HDFS administrator. Here are some sample action/command pairs:
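For example (again with era-appropriate commands):

    Put the cluster in Safemode:                  bin/hadoop dfsadmin -safemode enter
    Generate a list of DataNodes:                 bin/hadoop dfsadmin -report
    Recommission or decommission DataNode(s):     bin/hadoop dfsadmin -refreshNodes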

Browser Interface

A typical HDFS install configures a web server to expose the HDFS namespace through a configurable TCP port. This allows a user to navigate the HDFS namespace and view the contents of its files using a web browser.

Space Reclamation

File Deletes and Undeletes

When a file is deleted by a user or an application, it is not immediately removed from HDFS. Instead, HDFS first renames it to a file in the /trash directory. The file can be restored quickly as long as it remains in /trash. A file remains in /trash for a configurable amount of time. After the expiry of its life in /trash, the NameNode deletes the file from the HDFS namespace. The deletion of a file causes the blocks associated with the file to be freed. Note that there could be an appreciable time delay between the time a file is deleted by a user and the time of the corresponding increase in free space in HDFS.

A user can Undelete a file after deleting it as long as it remains in the /trash directory. If a user wants to undelete a file that he/she has deleted, he/she can navigate the /trash directory and retrieve the file. The /trash directory contains only the latest copy of the file that was deleted. The /trash directory is just like any other directory with one special feature: HDFS applies specified policies to automatically delete files from this directory. The current default policy is to delete files from /trash that are more than 6 hours old. In the future, this policy will be configurable through a well defined interface.
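A sketch of the round trip, assuming the flat /trash layout this document describes (the exact trash path and retention policy differ across releases):

    bin/hadoop dfs -rm /foodir/myfile.txt                   # the file is renamed into /trash, not freed
    bin/hadoop dfs -mv /trash/foodir/myfile.txt /foodir/    # restore it before its /trash life expires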

Decrease Replication Factor

When the replication factor of a file is reduced, the NameNode selects excess replicas that can be deleted. The next Heartbeat transfers this information to the DataNode. The DataNode then removes the corresponding blocks and the corresponding free space appears in the cluster. Once again, there might be a time delay between the completion of the setReplication API call and the appearance of free space in the cluster.
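Through the Java API this is a single call; a minimal sketch with an illustrative path and target factor:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class SetReplication {
        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            // Lower the replication factor of one file to 2; excess replicas
            // are removed lazily via Heartbeats as described above.
            boolean ok = fs.setReplication(new Path("/foodir/myfile.txt"), (short) 2);
            System.out.println("setReplication accepted: " + ok);
        }
    }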

Chinese Translation (中文译本)

原文地址:http://hadoop.apache.org/docs/r0.18.3/hdfs_design.html

一、引言

Hadoop分布式文件系统(HDFS)被设计成适合运行在通用硬件(commodity hardware)上的分布式文件系统。它和现有的分布式文件系统有很多共同点。但同时,它和其他的分布式文件系统的区别也是很明显的。HDFS是一个高度容错性的系统,适合部署在廉价的机器上。HDFS能提供高吞吐量的数据访问,非常适合大规模数据集上的应用。HDFS放宽了一部分POSIX约束,来实现流式读取文件系统数据的目的。HDFS在最开始是作为Apache Nutch搜索引擎项目的基础架构而开发的。HDFS是Apache Hadoop Core项目的一部分。这个项目的地址是http://hadoop.apache.org/core/。

二、前提和设计目标

2.1 硬件错误

硬件错误是常态而不是异常。HDFS可能由成百上千的服务器所构成,每个服务器上存储着文件系统的部分数据。我们面对的现实是构成系统的组件数目是巨大的,而且任一组件都有可能失效,这意味着总是有一部分HDFS的组件是不工作的。因此错误检测和快速、自动的恢复是HDFS最核心的架构目标。

2.2 流式数据访问

运行在HDFS上的应用和普通的应用不同,需要流式访问它们的数据集。HDFS 的设计中更多的考虑到了数据批处理,而不是用户交互处理。比之数据访问的低延迟问题,更关键的在于数据访问的高吞吐量。POSIX标准设置的很多硬性约束对HDFS应用系统不是必需的。为了提高数据的吞吐量,在一些关键方面对POSIX 的语义做了一些修改。

2.3 大规模数据集

运行在HDFS上的应用具有很大的数据集。HDFS上的一个典型文件大小一般都在G字节至T字节。因此,HDFS被调节以支持大文件存储。它应该能提供整体上高的数据传输带宽,能在一个集群里扩展到数百个节点。一个单一的HDFS 实例应该能支撑数以千万计的文件。

2.4 简单的一致性模型

HDFS应用需要一个“一次写入多次读取”的文件访问模型。一个文件经过创建、写入和关闭之后就不需要改变。这一假设简化了数据一致性问题,并且使高吞吐量的数据访问成为可能。Map/Reduce应用或者网络爬虫应用都非常适合这个模型。目前还有计划在将来扩充这个模型,使之支持文件的附加写操作。

2.5 “移动计算比移动数据更划算”

一个应用请求的计算,离它操作的数据越近就越高效,在数据达到海量级别的时候更是如此。因为这样就能降低网络阻塞的影响,提高系统数据的吞吐量。将计算移动到数据附近,比之将数据移动到应用所在显然更好。HDFS为应用提供了将它们自己移动到数据附近的接口。

2.6异构软硬件平台间的可移植性

HDFS在设计的时候就考虑到平台的可移植性。这种特性方便了HDFS作为大规模数据应用平台的推广。

三、Namenode 和Datanode

HDFS采用master/slave架构。一个HDFS集群是由一个Namenode和一定数目的Datanodes组成。Namenode是一个中心服务器,负责管理文件系统的名字空间(namespace)以及客户端对文件的访问。集群中的Datanode一般是一个节点一个,负责管理它所在节点上的存储。HDFS暴露了文件系统的名字空间,用户能够以文件的形式在上面存储数据。从内部看,一个文件其实被分成一个或多个数据块,这些块存储在一组Datanode上。Namenode执行文件系统的名字空间操作,比如打开、关闭、重命名文件或目录。它也负责确定数据块到具体Datanode节点的映射。Datanode负责处理文件系统客户端的读写请求,在Namenode的统一调度下进行数据块的创建、删除和复制。

Namenode和Datanode被设计成可以在普通的商用机器上运行。这些机器一般运行着GNU/Linux操作系统(OS)。HDFS采用Java语言开发,因此任何支持Java 的机器都可以部署Namenode或Datanode。由于采用了可移植性极强的Java语言,使得HDFS可以部署到多种类型的机器上。一个典型的部署场景是一台机器上只运行一个Namenode实例,而集群中的其它机器分别运行一个Datanode实例。这种架构并不排斥在一台机器上运行多个Datanode,只不过这样的情况比较少见。

集群中单一Namenode的结构大大简化了系统的架构。Namenode是所有HDFS 元数据的仲裁者和管理者,这样,用户数据永远不会流过Namenode。

四、文件系统的名字空间(namespace)

HDFS支持传统的层次型文件组织结构。用户或者应用程序可以创建目录,然后将文件保存在这些目录里。文件系统名字空间的层次结构和大多数现有的文件系统类似:用户可以创建、删除、移动或重命名文件。当前,HDFS不支持用户磁盘配额和访问权限控制,也不支持硬链接和软链接。但是HDFS架构并不排斥实现这些特性。

Namenode负责维护文件系统的名字空间,任何对文件系统名字空间或属性的修改都将被Namenode记录下来。应用程序可以设置HDFS保存的文件的副本数目。文件副本的数目称为文件的副本系数,这个信息也是由Namenode保存的。

五、数据复制

HDFS被设计成能够在一个大集群中跨机器可靠地存储超大文件。它将每个文件存储成一系列的数据块,除了最后一个,所有的数据块都是同样大小的。为了容错,文件的所有数据块都会有副本。每个文件的数据块大小和副本系数都是可配置的。应用程序可以指定某个文件的副本数目。副本系数可以在文件创建的时候指定,也可以在之后改变。HDFS中的文件都是一次性写入的,并且严格要求在任何时候只能有一个写入者。

Namenode全权管理数据块的复制,它周期性地从集群中的每个Datanode接收心跳信号和块状态报告(Blockreport)。接收到心跳信号意味着该Datanode节点工作正常。块状态报告包含了一个该Datanode上所有数据块的列表。

5.1 副本存放:最开始的一步

副本的存放是HDFS可靠性和性能的关键。优化的副本存放策略是HDFS区分于其他大部分分布式文件系统的重要特性。这种特性需要做大量的调优,并需要经验的积累。HDFS采用一种称为机架感知(rack-aware)的策略来改进数据的可靠性、可用性和网络带宽的利用率。目前实现的副本存放策略只是在这个方向上的第一步。实现这个策略的短期目标是验证它在生产环境下的有效性,观察它的行为,为实现更先进的策略打下测试和研究的基础。

大型HDFS实例一般运行在跨越多个机架的计算机组成的集群上,不同机架上的两台机器之间的通讯需要经过交换机。在大多数情况下,同一个机架内的两台机器间的带宽会比不同机架的两台机器间的带宽大。

通过一个机架感知的过程,Namenode可以确定每个Datanode所属的机架id。一个简单但没有优化的策略就是将副本存放在不同的机架上。这样可以有效防止当整个机架失效时数据的丢失,并且允许读数据的时候充分利用多个机架的带宽。这种策略设置可以将副本均匀分布在集群中,有利于当组件失效情况下的负载均衡。但是,因为这种策略的一个写操作需要传输数据块到多个机架,这增加了写的代价。

在大多数情况下,副本系数是3,HDFS的存放策略是将一个副本存放在本地机架的节点上,一个副本放在同一机架的另一个节点上,最后一个副本放在不同机架的节点上。这种策略减少了机架间的数据传输,这就提高了写操作的效率。机架的错误远远比节点的错误少,所以这个策略不会影响到数据的可靠性和可用性。与此同时,因为数据块只放在两个(不是三个)不同的机架上,所以此策略减少了读取数据时需要的网络传输总带宽。在这种策略下,副本并不是均匀分布在不同的机架上:三分之一的副本在一个节点上,三分之二的副本在一个机架上(即包含该节点的本地机架),其他副本均匀分布在剩下的机架中。这一策略在不损害数据可靠性和读取性能的情况下改进了写的性能。

当前,这里介绍的默认副本存放策略正在开发的过程中。

5.2 副本选择

为了降低整体的带宽消耗和读取延时,HDFS会尽量让读取程序读取离它最近的副本。如果在读取程序的同一个机架上有一个副本,那么就读取该副本。如果一个HDFS集群跨越多个数据中心,那么客户端也将首先读本地数据中心的副本。

5.3 安全模式

Namenode启动后会进入一个称为安全模式的特殊状态。处于安全模式的Namenode是不会进行数据块的复制的。Namenode从所有的Datanode接收心跳信号和块状态报告。块状态报告包括了某个Datanode所有的数据块列表。每个数据块都有一个指定的最小副本数。当Namenode检测确认某个数据块的副本数目达到这个最小值,那么该数据块就会被认为是副本安全(safely replicated)的;在一定百分比(这个参数可配置)的数据块被Namenode检测确认是安全之后(加上一个额外的30秒等待时间),Namenode将退出安全模式状态。接下来它会确定还有哪些数据块的副本没有达到指定数目,并将这些数据块复制到其他Datanode上。

六、文件系统元数据的持久化

Namenode上保存着HDFS的名字空间。对于任何对文件系统元数据产生修改的操作,Namenode都会使用一种称为EditLog的事务日志记录下来。例如,在HDFS中创建一个文件,Namenode就会在Editlog中插入一条记录来表示;同样地,修改文件的副本系数也将往Editlog插入一条记录。Namenode在本地操作系统的文件系统中存储这个Editlog。整个文件系统的名字空间,包括数据块到文件的映射、文件的属性等,都存储在一个称为FsImage的文件中,这个文件也是放在Namenode所在的本地文件系统上。

Namenode在内存中保存着整个文件系统的名字空间和文件数据块映射(Blockmap)的映像。这个关键的元数据结构设计得很紧凑,因而一个有4G内存的Namenode足够支撑大量的文件和目录。当Namenode启动时,它从硬盘中读取Editlog和FsImage,将所有Editlog中的事务作用在内存中的FsImage上,并将这个新版本的FsImage从内存中保存到本地磁盘上,然后删除旧的Editlog,因为这个旧的Editlog的事务都已经作用在FsImage上了。这个过程称为一个检查点(checkpoint)。在当前实现中,检查点只发生在Namenode启动时,在不久的将来将实现支持周期性的检查点。

Datanode将HDFS数据以文件的形式存储在本地的文件系统中,它并不知道有关HDFS文件的信息。它把每个HDFS数据块存储在本地文件系统的一个单独的文件中。Datanode并不在同一个目录创建所有的文件,实际上,它用试探的方法来确定每个目录的最佳文件数目,并且在适当的时候创建子目录。在同一个目录中创建所有的本地文件并不是最优的选择,这是因为本地文件系统可能无法高效地在单个目录中支持大量的文件。当一个Datanode启动时,它会扫描本地文件系统,产生一个这些本地文件对应的所有HDFS数据块的列表,然后作为报告发送到Namenode,这个报告就是块状态报告。

七、通讯协议

所有的HDFS通讯协议都是建立在TCP/IP协议之上。客户端通过一个可配置的TCP端口连接到Namenode,通过ClientProtocol协议与Namenode交互。而Datanode使用DatanodeProtocol协议与Namenode交互。一个远程过程调用(RPC)模型被抽象出来封装ClientProtocol和Datanodeprotocol协议。在设计上,Namenode不会主动发起RPC,而是响应来自客户端或 Datanode 的RPC请求。

八、健壮性

HDFS的主要目标就是即使在出错的情况下也要保证数据存储的可靠性。常见的三种出错情况是:Namenode出错, Datanode出错和网络割裂(network partitions)。

8.1 磁盘数据错误、心跳检测和重新复制

每个Datanode节点周期性地向Namenode发送心跳信号。网络割裂可能导致一部分Datanode跟Namenode失去联系。Namenode通过心跳信号的缺失来检测这一情况,并将这些近期不再发送心跳信号的Datanode标记为宕机(dead),不会再将新的IO请求发给它们。任何存储在宕机Datanode上的数据将不再有效。Datanode的宕机可能会引起一些数据块的副本系数低于指定值,Namenode不断地检测这些需要复制的数据块,一旦发现就启动复制操作。在下列情况下,可能需要重新复制:某个Datanode节点失效,某个副本遭到损坏,Datanode上的硬盘错误,或者文件的副本系数增大。
