Wednesday, June 3, 2009

Repost: About raw devices

Q: 1. What is a raw device?

A: A raw device, also known as a raw partition, is a disk partition that is not mounted and written to by a Linux filesystem (ext2/ext3, reiserfs) or by Oracle Cluster File System (OCFS), but is accessed through a character device driver. It is the responsibility of the application to organize how the data is written to the disk partition.


Q: 2. How can a raw device be recognised?

A: All hardware devices look like regular files; they can be opened, closed, read and written using the same standard system calls that are used to manipulate files. Every device in the system is represented by a device special file; for example, the first IDE disk in the system is represented by /dev/hda. For block (disk) and character devices, these device special files are created by the mknod command, and they describe the device using major and minor device numbers.
All devices controlled by the same device driver have a common major device number.
The minor device numbers are used to distinguish between different devices and their controllers; for example, each partition on the primary IDE disk has a different minor device number. So /dev/hda2, the second partition of the primary IDE disk, has a major number of 3 and a minor number of 2. Linux maps the device special file passed in system calls (say, to mount a file system on a block device) to the device's device driver using the major device number and a number of system tables, for example the character device table, chrdevs.
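The relationship between a device special file and its major and minor numbers can be inspected from a program. The following is a minimal Python sketch, not part of the original note; the helper name describe_device is hypothetical, and it relies only on the standard os and stat modules:

```python
import os
import stat

def describe_device(path):
    """Return (kind, major, minor) for a device special file.

    describe_device is a hypothetical helper introduced for illustration.
    """
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "character"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        raise ValueError(f"{path} is not a device special file")
    # st_rdev packs the major and minor numbers into a single integer.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)

# The numbers quoted above for /dev/hda2: major 3, minor 2.
hda2 = os.makedev(3, 2)
print(os.major(hda2), os.minor(hda2))
```

On a system that has raw devices, calling describe_device on one of them should report a character device with major number 162.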

RedHat AS supports three types of hardware device: character, block and network.

1. Character devices are read and written directly without buffering.

2. Block devices can only be written to and read from in multiples of the block size, typically 512 or 1024 bytes. Block devices are accessed via the buffer cache and may be randomly accessed, that is to say, any block can be read or written no matter where it is on the device. Block devices can be accessed via their device special file, but more commonly they are accessed via the file system. Only a block device can support a mounted file system.

3. Network devices are accessed via the BSD socket interface and the networking subsystems.

Raw devices are character devices (major number 162).
The first minor number (i.e. 0) is reserved as a control interface and is usually found at /dev/rawctl.
A sequence of commands listing the raw devices:

# ls -lR /dev/rawctl
crw-rw---- 1 root disk 162, 0 Mar 19 2002 /dev/rawctl

# ls -lR /dev/raw[1-4]
crw-rw---- 1 root disk 162, 1 Mar 19 2002 /dev/raw1
crw-rw---- 1 root disk 162, 2 Mar 19 2002 /dev/raw2
crw-rw---- 1 root disk 162, 3 Mar 19 2002 /dev/raw3
crw-rw---- 1 root disk 162, 4 Mar 19 2002 /dev/raw4


Q: 3. What are the benefits of raw devices?

A: A raw device can be bound to an existing block device (e.g. a disk) and be used to perform "raw" I/O with that existing block device.
Such "raw" I/O bypasses the caching (Linux buffer cache) that is normally associated with block devices and eliminates file system overheads such as inodes or free lists. Hence a raw device offers a more "direct" route to the physical device and allows an application more control over the timing of I/O to that physical device. This makes raw devices suitable for complex applications like Database Management Systems that typically do their own caching.
If there is no I/O bottleneck, raw devices will not help. Note that the overall amount of I/O is not reduced; it is just done more efficiently.


Q: 4. Are there circumstances when raw devices have to be used?

A: If you are using Oracle Parallel Server (OPS) or Oracle Real Application Cluster (RAC) without Oracle Cluster File System (OCFS), all data files, control files, and redo log files must be placed on raw partitions so that they can be shared between nodes. Raw devices are also needed if you use List I/O or Asynchronous I/O. These facilities allow a program to issue multiple write operations without waiting for the previous write to return; to take advantage of them, the data files must be on raw devices.


Q: 5. Can I use the entire raw partition for Oracle?

A: No. You should specify a tablespace slightly smaller in size than the raw partition size, specifically at least two Oracle block sizes smaller.


Q: 6. How many raw devices do I have in RedHat AS by default, and how many can I have?

A: The RedHat AS operating system limits the number of raw devices that Linux can access to 255.
By default on RedHat Advanced Server there are 128 raw devices under /dev/raw:

# ls -l /dev/raw*
crw-rw---- 1 root disk 162, 1 Mar 19 2002 /dev/raw1
(...)
crw-rw---- 1 root disk 162, 128 Mar 19 2002 /dev/raw128

Linux cannot handle more than a limited number of partitions per drive. In Linux you have 4 primary partitions (3 of them usable, if you are using logical partitions) and at most 15 partitions altogether on a SCSI disk (63 altogether on an IDE disk).


Q: 7. How can I create new raw devices?

A: If you need to create additional raw devices, run the following commands as the root user (see man mknod):

# mknod -m 660 /dev/raw/rawXXX c 162 XXX
# chown root:disk /dev/raw/rawXXX
(where XXX is an integer with 128 < XXX < 256)

e.g.:
# mknod -m 660 /dev/raw/raw130 c 162 130
# chown root:disk /dev/raw/raw130
# ls -l /dev/raw/raw130
crw-rw---- 1 root disk 162, 130 Dec 23 18:57 /dev/raw/raw130
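
When several additional nodes are needed, the command lines above can be generated rather than typed by hand. This is a small Python sketch under stated assumptions: the function name mknod_commands is hypothetical, it only builds the command strings (it does not run them), and it takes the major number 162 and the /dev/raw directory from the text above.

```python
RAW_MAJOR = 162  # major number of the raw character driver, as stated above

def mknod_commands(first, last, directory="/dev/raw"):
    """Build mknod/chown command lines for raw minors first..last.

    Hypothetical helper: it returns strings and executes nothing.
    """
    if not (1 <= first <= last <= 255):
        raise ValueError("raw minor numbers must lie in 1..255")
    lines = []
    for minor in range(first, last + 1):
        node = f"{directory}/raw{minor}"
        # Same permissions and ownership as in the manual example.
        lines.append(f"mknod -m 660 {node} c {RAW_MAJOR} {minor}")
        lines.append(f"chown root:disk {node}")
    return lines

for line in mknod_commands(129, 131):
    print(line)
```

The printed lines can be reviewed and then run as root.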


Q: 8. Who should own the raw device?

A: You will need to create the raw devices as root, but the ownership should be changed to the 'oracle' account afterwards. The group must also be changed to the DBA group (usually called dba).


Q: 9. How can I use a raw device for Oracle RDBMS?

A: Suppose we have a 9 GB SCSI disk drive. The steps are:

a. Partition the disk drive (/dev/sdb)
b. Bind the raw devices to the partitions on the new SCSI disk
c. Change the ownership of the raw devices
d. Create a new Oracle datafile on the raw devices

- Partition the disk drive with the fdisk command (see man fdisk):
1. As user root, type

# fdisk /dev/sdb

2. Type 'p' to see the list of existing partitions on your disk drive:

command (m for help): p
Disk /dev/sdb: 255 heads, 63 sectors, 1174 cylinders
Units = cylinders of 16065 * 512 bytes

Device Boot Start End Block ID System

3.a. To create a partition, use the 'n' command and then choose an extended partition with the 'e' option.
You need an extended partition because this disk will contain more than 4 partitions.
Create partition number 1 first, so choose number 1.

command (m for help): n
command action
e extended
p primary partition (1-4)
e
Partition Number (1-4): 1
First cylinder (1-1115, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-1115, default 1115):
Using default value 1115

3.b. Now, within the extended partition,
create 6 logical partitions of equal size: each should be 257 MB (256 MB + 1 MB for the headers).
Press 'n', then 'l', accept the default first cylinder, and enter the size of the partition prefixed with a '+': +257M.
Repeat these steps 6 times.

command (m for help): n
command action
l logical (5 or over)
p primary partition (1-4)
l
Partition Number (1-4): 1
First cylinder (1-1115, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-1115, default 1115): +257M
(...repeat 5 more times...)

command (m for help): p
Disk /dev/sdb: 255 heads, 63 sectors, 1174 cylinders
Units = cylinders of 16065 * 512 bytes

Device Boot Start End Block ID System
/dev/sdb1 1 1115 8956206 5 Extended
/dev/sdb5 1 33 265009+ 83 Linux
/dev/sdb6 34 66 265041 83 Linux
(...)
/dev/sdb10 166 198 265041 83 Linux

3.c. Now press 'w'; this writes the partition table to the disk and quits the fdisk program.

- Bind the raw devices to the partitions on the new SCSI disk

A utility called raw (see man raw) can be used to bind a raw device to an existing block device:

# raw /dev/raw/raw1 /dev/sdb5
/dev/raw/raw1: bound to major 8, minor 21
(...)
# raw /dev/raw/raw6 /dev/sdb10
/dev/raw/raw6: bound to major 8, minor 26
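
The major and minor numbers reported by raw are those of the underlying block device. As a cross-check, the following Python sketch computes them from the partition name. It assumes the conventional Linux numbering for the first 16 SCSI disks (major 8, with 16 minor numbers reserved per disk); the function name sd_device_numbers is hypothetical.

```python
def sd_device_numbers(name):
    """Map a name such as 'sdb5' to (major, minor).

    Assumes the conventional numbering for sda..sdp: major 8 and
    minor = 16 * disk_index + partition_number (0 for the whole disk).
    """
    if len(name) < 3 or not name.startswith("sd") or not ("a" <= name[2] <= "p"):
        raise ValueError(f"unsupported device name: {name}")
    disk_index = ord(name[2]) - ord("a")      # sda -> 0, sdb -> 1, ...
    partition = int(name[3:]) if name[3:] else 0
    if not (0 <= partition <= 15):
        raise ValueError("a SCSI disk has at most 15 partitions")
    return 8, 16 * disk_index + partition

for dev in ("sdb5", "sdb10"):
    print(dev, sd_device_numbers(dev))
```

Comparing these values with the output of the raw command is a quick way to confirm that each raw device is bound to the intended partition.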

Finally, the bindings of raw devices to partitions must be re-established after each startup.
For this reason, as user root, edit /etc/sysconfig/rawdevices and add one binding per line (raw device first, then block device):

/dev/raw/raw1 /dev/sdb5
/dev/raw/raw2 /dev/sdb6
/dev/raw/raw3 /dev/sdb7
/dev/raw/raw4 /dev/sdb8
/dev/raw/raw5 /dev/sdb9
/dev/raw/raw6 /dev/sdb10

- Change the ownership of the raw devices

As root user type:

# cd /dev/raw
# chown oracle:dba raw[1-6]

- Create a new Oracle datafile on raw device

When using a raw device you need to specify the full pathname in single quotes and use the REUSE parameter.
When creating the Oracle tablespace on the raw partition, a slightly smaller size than the actual partition size needs to be specified.
This size can be calculated as follows:

Size of Redo Log = Raw Partition Size - 1*512 byte block
Size of Data File = Raw Partition Size - 2* Oracle Block Size

e.g. (db_block_size=8192):
create tablespace tablespace_on_raw datafile '/dev/raw/raw1' size 246784K REUSE,
  '/dev/raw/raw2' size 246784K REUSE,
  '/dev/raw/raw3' size 246784K REUSE,
  '/dev/raw/raw4' size 246784K REUSE,
  '/dev/raw/raw5' size 246784K REUSE,
  '/dev/raw/raw6' size 246784K REUSE;
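
The size formulas above can be worked through for the partitions created earlier. The Python sketch below is illustrative only: the function names are hypothetical, it assumes the fdisk block column is in units of 1 KB (so /dev/sdb5 is 265009 KB), and rounding the data file down to a whole number of Oracle blocks is an extra assumption not stated in the formulas.

```python
def max_datafile_kb(partition_kb, db_block_size=8192):
    """Upper bound for a data file in KB: partition size minus two Oracle blocks.

    Assumption: the result is also rounded down to whole Oracle blocks.
    """
    whole_blocks = (partition_kb * 1024) // db_block_size
    usable_blocks = whole_blocks - 2          # leave two Oracle blocks free
    return usable_blocks * db_block_size // 1024

def max_redo_log_bytes(partition_kb):
    """Upper bound for a redo log: partition size minus one 512-byte block."""
    return partition_kb * 1024 - 512

limit_kb = max_datafile_kb(265009)
print(limit_kb)                # upper bound in KB for /dev/sdb5
print(246784 <= limit_kb)      # the size used in the example fits
```

Any size at or below the computed bound satisfies the rule, which is why the example can use a rounder figure such as 246784K.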



Q: 10. Does the Oracle block size have any relevance on a raw device?

A: It is of less importance than for a UNIX file; the size of the Oracle block can be changed, but it must be a multiple of the physical block size, as it is only possible to seek to physical block boundaries and hence to write only in multiples of the physical block size.


Q: 11. How can I back up my database files if they are on raw devices?

A: You cannot use utilities such as 'tar' or 'cpio', which expect a filesystem to be present.
Usually people move Oracle datafiles between filesystems and raw devices using the 'dd' command. Using dd is the fastest method. However, it is necessary to know how many blocks to skip in the raw device (e.g. on Tru64 Unix you have to skip 64K), so that you do not overwrite information necessary for the operating system. The number of blocks to skip differs between platforms. With RMAN there is no need to know such platform-specific information: the RMAN copy command can copy datafiles from filesystem files to raw devices.

# dd if=/dev/raw/raw1 of=/u01/oradata/test_ts.dbf bs=16K
(Keep the block size a multiple of the Oracle block size)

See the UNIX man page on dd for further details.
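
To make the block-size rule concrete, here is a minimal Python sketch of a dd-like copy that refuses a block size that is not a multiple of the Oracle block size. It is an illustration, not a replacement for dd: the function name copy_in_blocks and its parameters are hypothetical, the demo runs on an ordinary temporary file standing in for the raw device, and real raw devices additionally require aligned I/O, which this sketch does not address.

```python
import os
import tempfile

def copy_in_blocks(src, dst, bs=16 * 1024, db_block_size=8192, skip_bytes=0):
    """Copy src to dst in bs-sized reads, similar in spirit to dd.

    skip_bytes models the platform-specific header to skip on the source.
    Returns the number of bytes copied.
    """
    if bs % db_block_size != 0:
        raise ValueError("bs must be a multiple of the Oracle block size")
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fin.seek(skip_bytes)
        while True:
            chunk = fin.read(bs)
            if not chunk:
                break
            fout.write(chunk)
            copied += len(chunk)
    return copied

# Demo on an ordinary temporary file standing in for the raw device.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "fake_raw1")
    dst = os.path.join(tmp, "test_ts.dbf")
    with open(src, "wb") as f:
        f.write(os.urandom(5 * 8192))
    print(copy_in_blocks(src, dst))  # number of bytes copied
```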

You can use RMAN.
From filesystem to raw device:

RMAN> run {
2> allocate channel c1 type disk;
3> copy datafile '/u01/oradata/test_ts.dbf' to '/dev/raw/raw1';
4> }

From raw device to filesystem:

RMAN> run {
2> allocate channel c1 type disk;
3> copy datafile '/dev/raw/raw1' to '/u01/oradata/test_ts.dbf';
4> }


Q: 12. Provided I am not using Parallel Server or Real Application Cluster, can I use a mixture of raw partitions and filesystem files?

A:Yes. The drawback is that this makes your backup strategy more complicated.


Q: 13. Should I store my redo log files on raw partitions?

A: Redo logs are particularly suitable candidates for being located on raw partitions, as they are write-intensive and, in addition, are written to sequentially. If OPS or RAC is being used, redo logs must be stored on raw partitions.


Q: 14. Can I use raw partitions for archive logs?

A:No. Archive logs must be stored on a partition with a UNIX filesystem.


Q: 15. Can I have more than one data file on a raw partition?

A: No. This means you should be careful when setting up the raw partition. Too small a size will necessitate reorganisation when you run out of space, whereas too large a size will waste any space the file does not use.


Q: 16. Should my raw partitions be on the same disk device?

A: This is inadvisable, as there is likely to be contention. You should place raw devices on different disks, which should also be on different controllers.


Q: 17. Do I need to make my raw partitions all the same size?

A: This is not essential, but it provides flexibility in the event of having to change the database configuration.

Q: 18. Do I need to change any UNIX kernel parameters if I decide to use raw devices?

A: No.


Q: 19. What other UNIX-level changes could help to improve I/O performance?

A: RAID and disk mirroring can be beneficial, depending on the application characteristics, especially whether it is read-intensive, write-intensive, or a mixture.


Q: 20. How can I gain further performance benefits, after considering all of the above?

A:You will need to buy more disk drives and controllers for your system, to spread the I/O load between devices.


ref: http://airlgc.blog.51cto.com/161810/26441
http://bbs.chinaunix.net/viewthread.php?tid=1293379
