Last modified by Drunk Monkey on 2024-09-01 12:39

From version 4.3
edited by Drunk Monkey
on 2024-09-01 08:54
Change comment: There is no comment for this version
To version 5.1
edited by Drunk Monkey
on 2024-09-01 08:57
Change comment: There is no comment for this version

Page properties
Content
... ... @@ -92,17 +92,20 @@
92 92  
93 93  Notice that "mirror-0" is now the VDEV, with each physical device managed by it. As mentioned earlier, this would be analogous to a Linux software RAID "/dev/md0" device representing the four physical devices. Let's now clean up our pool, and create another.
94 94  
95 -{{{# zpool destroy tank}}}
95 +{{code language="bash session"}}
96 +# zpool destroy tank
97 +{{/code}}
96 96  
97 97  == Nested VDEVs ==
98 98  
99 99  VDEVs can be nested. A perfect example is a standard RAID-1+0 (commonly referred to as "RAID-10"). This is a stripe of mirrors. In order to specify the nested VDEVs, I just put them on the command line in order (emphasis mine):
100 100  
101 -{{{# zpool create tank mirror sde sdf mirror sdg sdh
103 +{{code language="bash session"}}
104 +# zpool create tank mirror sde sdf mirror sdg sdh
102 102  # zpool status
103 103   pool: tank
104 104   state: ONLINE
105 - scan: none requested
108 + scan: none requested
106 106  config:
107 107  
108 108   NAME STATE READ WRITE CKSUM
... ... @@ -114,22 +114,27 @@
114 114   sdg ONLINE 0 0 0
115 115   sdh ONLINE 0 0 0
116 116  
117 -errors: No known data errors}}}
120 +errors: No known data errors
121 +{{/code}}
118 118  
123 +
119 119  The first VDEV is "mirror-0" which is managing /dev/sde and /dev/sdf. This was done by calling "mirror sde sdf". The second VDEV is "mirror-1" which is managing /dev/sdg and /dev/sdh. This was done by calling "mirror sdg sdh". Because VDEVs are always dynamically striped, "mirror-0" and "mirror-1" are striped, thus creating the RAID-1+0 setup. Don't forget to clean up before continuing:
120 120  
121 -{{{# zpool destroy tank}}}
126 +{{code language="bash session"}}
127 +# zpool destroy tank
128 +{{/code}}
122 122  
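If you have more than two pairs of disks, typing out each "mirror" keyword gets tedious. Here is a minimal sketch of a helper that only prints the "zpool create" command for a stripe of mirrors, so you can review it before running it yourself. The function name "striped_mirror_cmd" is my own invention for illustration, not part of ZFS:

```shell
# Print (do not run) a "zpool create" command that stripes across mirrored
# pairs. Hypothetical helper: striped_mirror_cmd POOL DEV1 DEV2 [DEV3 DEV4 ...]
striped_mirror_cmd() {
    pool=$1
    shift
    # Mirrors are built from pairs here, so an even, non-zero count is required.
    if [ "$#" -eq 0 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "need an even number of devices" >&2
        return 1
    fi
    cmd="zpool create $pool"
    # Each pair of devices becomes one "mirror" VDEV; ZFS stripes across them.
    while [ "$#" -gt 0 ]; do
        cmd="$cmd mirror $1 $2"
        shift 2
    done
    printf '%s\n' "$cmd"
}

striped_mirror_cmd tank sde sdf sdg sdh
```

With the four disks from this section, it prints the same command used above.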
123 123  == File VDEVs ==
124 124  
125 125  As mentioned, pre-allocated files can be used for setting up zpools on your existing ext4 filesystem (or whatever). It should be noted that this is meant entirely for testing purposes, and not for storing production data. Using files is a great way to have a sandbox, where you can test compression ratio, the size of the deduplication table, or other things without actually committing production data to it. When creating file VDEVs, you cannot use relative paths, but must use absolute paths. Further, the image files must be preallocated, and not sparse files or thin-provisioned. Let's see how this works:
126 126  
127 -{{{# for i in {1..4}; do dd if=/dev/zero of=/tmp/file$i bs=1G count=4 &> /dev/null; done
134 +{{code language="bash session"}}
135 +# for i in {1..4}; do dd if=/dev/zero of=/tmp/file$i bs=1G count=4 &> /dev/null; done
128 128  # zpool create tank /tmp/file1 /tmp/file2 /tmp/file3 /tmp/file4
129 129  # zpool status tank
130 130   pool: tank
131 131   state: ONLINE
132 - scan: none requested
140 + scan: none requested
133 133  config:
134 134  
135 135   NAME STATE READ WRITE CKSUM
... ... @@ -139,21 +139,25 @@
139 139   /tmp/file3 ONLINE 0 0 0
140 140   /tmp/file4 ONLINE 0 0 0
141 141  
142 -errors: No known data errors}}}
150 +errors: No known data errors
151 +{{/code}}
143 143  
144 144  In this case, we created a RAID-0. We used files preallocated from /dev/zero that are each 4 GB in size. Thus, the size of our zpool is 16 GB in usable space. Each file, as with our first example using disks, is a VDEV. Of course, you can treat the files as disks, and put them into a mirror configuration, RAID-1+0, RAIDZ-1 (coming in the next post), etc.
145 145  
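The two rules above are easy to check before you hand anything to "zpool create". This is just a sketch: "all_absolute" and "stripe_size_gb" are hypothetical helper names of mine, and the size calculation is the simple "VDEVs times size" arithmetic from this section, ignoring any filesystem overhead:

```shell
# Succeeds only if every argument is an absolute path (starts with "/").
all_absolute() {
    for path in "$@"; do
        case $path in
            /*) ;;
            *) echo "not an absolute path: $path" >&2; return 1 ;;
        esac
    done
}

# Size of a plain stripe: number of VDEVs times the size of each, in GB.
stripe_size_gb() {
    echo $(( $1 * $2 ))
}

# Four 4 GB files, as in the example above.
all_absolute /tmp/file1 /tmp/file2 /tmp/file3 /tmp/file4 && stripe_size_gb 4 4
```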
146 -{{{# zpool destroy tank}}}
155 +{{code language="bash session"}}
156 +# zpool destroy tank
157 +{{/code}}
147 147  
148 148  == Hybrid pools ==
149 149  
150 150  This last example should show you the complex pools you can set up by using different VDEVs. Using our four file VDEVs from the previous example, and our four disk VDEVs /dev/sde through /dev/sdh, let's create a hybrid pool with cache and log drives. Again, I emphasized the nested VDEVs for clarity:
151 151  
152 -{{{# zpool create tank mirror /tmp/file1 /tmp/file2 mirror /tmp/file3 /tmp/file4 log mirror sde sdf cache sdg sdh
163 +{{code language="bash session"}}
164 +# zpool create tank mirror /tmp/file1 /tmp/file2 mirror /tmp/file3 /tmp/file4 log mirror sde sdf cache sdg sdh
153 153  # zpool status tank
154 154   pool: tank
155 155   state: ONLINE
156 - scan: none requested
168 + scan: none requested
157 157  config:
158 158  
159 159   NAME STATE READ WRITE CKSUM
... ... @@ -172,22 +172,26 @@
172 172   sdg ONLINE 0 0 0
173 173   sdh ONLINE 0 0 0
174 174  
175 -errors: No known data errors}}}
187 +errors: No known data errors
188 +{{/code}}
176 176  
177 177  There's a lot going on here, so let's dissect it. First, we created a RAID-1+0 using our four preallocated image files. Notice the VDEVs "mirror-0" and "mirror-1", and what they are managing. Second, we created a third VDEV called "mirror-2" that actually is not used for storing data in the pool, but is used as a ZFS intent log, or ZIL. We'll cover the ZIL in more detail in another post. Then we created two VDEVs for caching data called "sdg" and "sdh". They are standard disk VDEVs that we've already learned about. However, they are also managed by the "cache" VDEV. So, in this case, we've used 6 of the 7 VDEVs listed above, the only one missing is "spare".
178 178  
179 179  Noticing the indentation will help you see what VDEV is managing what. The "tank" pool is composed of the "mirror-0" and "mirror-1" VDEVs for long-term persistent storage. The ZIL is managed by "mirror-2", which is composed of /dev/sde and /dev/sdf. The read-only cache VDEV is managed by two disks, /dev/sdg and /dev/sdh. Neither the "logs" nor the "cache" are long-term storage for the pool, thus creating a "hybrid pool" setup.
180 180  
181 -{{{# zpool destroy tank}}}
194 +{{code language="bash session"}}
195 +# zpool destroy tank
196 +{{/code}}
182 182  
183 183  == Real life example ==
184 184  
185 185  In production, the files would be physical disks, and the ZIL and cache would be fast SSDs. Here is my current zpool setup which is storing this blog, among other things:
186 186  
187 -{{{# zpool status pool
202 +{{code language="bash session"}}
203 +# zpool status pool
188 188   pool: pool
189 189   state: ONLINE
190 - scan: scrub repaired 0 in 2h23m with 0 errors on Sun Dec 2 02:23:44 2012
206 + scan: scrub repaired 0 in 2h23m with 0 errors on Sun Dec 2 02:23:44 2012
191 191  config:
192 192  
193 193   NAME STATE READ WRITE CKSUM
... ... @@ -205,19 +205,22 @@
205 205   ata-OCZ-REVODRIVE_OCZ-33W9WE11E9X73Y41-part2 ONLINE 0 0 0
206 206   ata-OCZ-REVODRIVE_OCZ-X5RG0EIY7MN7676K-part2 ONLINE 0 0 0
207 207  
208 -errors: No known data errors}}}
224 +errors: No known data errors
225 +{{/code}}
209 209  
210 210  Notice that my "logs" and "cache" VDEVs are OCZ Revodrive SSDs, while the four platter disks are in a RAIDZ-1 VDEV (RAIDZ will be discussed in the next post). Notice also that the SSDs are named "ata-OCZ-REVODRIVE_OCZ-33W9WE11E9X73Y41-part1", etc. These names are found in /dev/disk/by-id/. The reason I chose these instead of "sdb" and "sdc" is that the cache and log devices don't necessarily store the same ZFS metadata. Thus, when the pool is being imported on boot, they may not come into the pool, and could be missing. Or, the motherboard may assign the drive letters in a different order. This isn't a problem with the main pool, but is a big problem on GNU/Linux with log and cache devices. Using the device name under /dev/disk/by-id/ ensures greater persistence and uniqueness.
211 211  
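If you only know a drive by its letter, you can look up its persistent name by following the symlinks in /dev/disk/by-id/. A rough sketch, assuming "readlink -f" is available (it is in GNU coreutils); the function name "by_id_names" is hypothetical, and the optional second argument only exists so you can point it at a different directory:

```shell
# Print every symlink in a by-id style directory that resolves to the given
# device. Hypothetical helper: by_id_names DEVICE [DIRECTORY]
by_id_names() {
    dev=$(readlink -f "$1") || return 1
    dir=${2:-/dev/disk/by-id}
    for link in "$dir"/*; do
        # Skip the unexpanded glob if the directory is empty or missing.
        [ -e "$link" ] || continue
        if [ "$(readlink -f "$link")" = "$dev" ]; then
            printf '%s\n' "$link"
        fi
    done
}

# Example (run on a machine that has the disk): by_id_names /dev/sdb
```

A disk usually has more than one name in that directory, so expect several lines of output and pick the one you want to pass to zpool.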
212 212  Also notice the simplicity of the implementation. Consider doing something similar with LVM, RAID and ext4. You would need to do the following:
213 213  
214 -{{{# mdadm -C /dev/md0 -l 0 -n 4 /dev/sde /dev/sdf /dev/sdg /dev/sdh
231 +{{code language="bash session"}}
232 +# mdadm -C /dev/md0 -l 0 -n 4 /dev/sde /dev/sdf /dev/sdg /dev/sdh
215 215  # pvcreate /dev/md0
216 216  # vgcreate tank /dev/md0
217 217  # lvcreate -l 100%FREE -n videos tank
218 218  # mkfs.ext4 /dev/tank/videos
219 219  # mkdir -p /tank/videos
220 -# mount -t ext4 /dev/tank/videos /tank/videos}}}
238 +# mount -t ext4 /dev/tank/videos /tank/videos
239 +{{/code}}
221 221  
222 222  The above was done in ZFS (minus creating the logical volume, which we'll get to later) with one command, rather than seven.
223 223  
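For comparison, here is the ZFS side as a sketch, assuming the same four disks striped together as in the mdadm example. The "run" wrapper is a hypothetical dry-run helper of mine: it only prints the command unless you set RUN=1, so you can paste this without touching any disks:

```shell
# Dry-run wrapper: prints the command instead of executing it unless RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# One command creates the striped pool; by default ZFS mounts it at /tank.
run zpool create tank sde sdf sdg sdh
```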
... ... @@ -226,7 +226,6 @@
226 226  This should act as a good starting point for getting the basic understanding of zpools and VDEVs. The rest of it is all downhill from here. You've made it over the "big hurdle" of understanding how ZFS handles pooled storage. We still need to cover RAIDZ levels, and we still need to go into more depth about log and cache devices, as well as pool settings, such as deduplication and compression, but all of these will be handled in separate posts. Then we can get into ZFS filesystem datasets, their settings, and advantages and disadvantages. But, you now have a head start on the core part of ZFS pools.
227 227  
228 228  ----
229 -
230 230  (% style="text-align: center;" %)
231 231  Posted by Aaron Toponce on Tuesday, December 4, 2012, at 6:00 am.
232 232  Filed under [[Debian>>url:https://web.archive.org/web/20210430213532/https://pthree.org/category/debian/]], [[Linux>>url:https://web.archive.org/web/20210430213532/https://pthree.org/category/linux/]], [[Ubuntu>>url:https://web.archive.org/web/20210430213532/https://pthree.org/category/ubuntu/]], [[ZFS>>url:https://web.archive.org/web/20210430213532/https://pthree.org/category/zfs/]].
... ... @@ -233,7 +233,6 @@
233 233  Follow any responses to this post with its [[comments RSS>>url:https://web.archive.org/web/20210430213532/https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/feed/]] feed.
234 234  You can [[post a comment>>url:https://web.archive.org/web/20210430213532/https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/#respond]] or [[trackback>>url:https://web.archive.org/web/20210430213532/https://pthree.org/2012/12/04/zfs-administration-part-i-vdevs/trackback/]] from your blog.
235 235  For IM, Email or Microblogs, here is the [[Shortlink>>url:https://web.archive.org/web/20210430213532/https://pthree.org/?p=2584]].
236 -
237 237  ----
238 238  
239 239  {{box title="**Archived From:**"}}